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Abstract 



High-energy 7-ray astronomy was revolutionized in 1991 with the launch of the Ener- 
getic Gamma-Ray Experiment Telescope (EGRET) on board the Compton Gamma- 
Ray Observatory. In addition to unprecedented instrument effective area and a narrow 
point-spread function, EGRET provided photon time-tagging to an absolute accuracy 
of 100 fis. The opportunity to analyze high-quality 7-ray data requires sophisticated 
statistical and analytic tools. Part | describes the analysis of periodic and transient 
signals in EGRET data. A method to search for the transient flux from 7-ray bursts 
independent of triggers from other 7-ray instruments is developed. Several known 
7-ray bursts were independently detected, and there is evidence for a previously un- 
known 7-ray burst candidate. Statistical methods using maximum likelihood and 
Bayesian inference are developed and implemented to extract periodic signals from 
7-ray sources in the presence of signiflcant astrophysical background radiation. The 
methods allow searches for periodic modulation without a priori knowledge of the 
period or period derivative. The analysis was performed on six pulsars and three pul- 
sar candidates. The three brightest pulsars. Crab, Vela, and Geminga, were readily 
identifled, and would have been detected independently in the EGRET data without 
knowledge of the pulse period. No signiflcant pulsation was detected in the three 
pulsar candidates. Furthermore, the method allows the analysis of sources with peri- 
ods on the same order as the time scales associated with changes in the instrumental 
sensitivity, such as the orbital time scale of GGRO around the Earth. Eighteen X- 
ray binaries were examined. None showed any evidence of periodicity. In addition, 
methods for calculating the detection threshold of periodic flux modulation were de- 
veloped. 
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The future hopes of 7-ray astronomy he in the development of the Gamma-ray 
Large Area Space Telescope, or GLAST . Part |I| describes the development and re- 
sults of the particle track reconstruction software for a GLAST science prototype 
instrument beamtest. The Kalman filtering method of track reconstruction is intro- 
duced and implemented. Monte Carlo simulations, very similar to those used for the 
full GLAST instrument, were performed to predict the instrumental response of the 
prototype. The prototype was tested in a 7-ray beam at SLAG. The reconstruction 
software was used to determine the incident 7-ray direction. It was found that the 
simulations did an excellent job of representing the actual instrument response. 
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Chapter 1 
Introduction 



The desire to experiment with the extremes of nature is innate. While it is almost 
always amusing, it can be informative as well. Such is the case in the study of some of 
the most energetic photons produced in space: the 7-rays (Figure |1.1D . With at least 
ten million times the energy of ordinary optical photons, 7-rays represent a unique 
window into the most energetic processes in astrophysics — the (electromagnetically) 
roaring jets from active galactic nuclei, the arcing plasmas in the intense gravitational 
fields of pulsars, and the enigmatic and inordinately powerful explosions known as 
7-ray bursts. 

The immense value of 7-rays for astrophysics lies both in their role as telltale 
markers of large energy-generation processes and in their likelihood of passing un- 
perturbed through vast reaches of intergalactic space. Measuring the 7-ray spectra 
of astrophysical objects sharply constrains estimates of their total energy output. 
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Figure 1.1: The Electromagnetic Spectrum. 7-Rays occupy the highest energy 
extreme of the electromagnetic spectrum, from 10 MeV to over 300 GeV. 
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Since 7-rays are so energetic, many astrophysical sources emit the bulk of their total 
power output at these high energies. In addition, 7-rays travel relatively unimpeded 
through space. Since they carry no charge, they are nearly unaffected by galactic and 
intergalactic magnetic fields. Their small interaction cross section means that they 
are relatively unaffected by dust and gas in the intervening space between the source 
and the detector. A high-energy 7-ray can travel through the central plane of the 
Galaxy with only a 1% chance of being absorbed [Q. 7-Rays may be observed from 
Earth essentially unchanged since they left the distant violence in which they were 
created. 

However, every silver lining has its cloud. The Earth's atmosphere is very good at 
absorbing 7-rays. Unfortunately, it means that precise astrophysical 7-ray observa- 
tions must be done in space. The second difficulty in 7-ray astronomy is intrinsic to 
the energy production mechanisms that produce 7-rays in the first place. Because 7- 
rays are so energetic, most sources produce very few of them. Detectors must be very 
efficient to collect these rare photons, and at the same time be able to discriminate 
against the sea of undesirable charged particles trapped in the Earth's magnetic field 
and albedo 7-rays which are generated in the Earth's atmosphere. This discrimination 
requires background rejection on the order of one part in ~10^ or better. 

1.1 The 7- Ray Observatories 

Experience with terrestrial accelerator-based 7-ray detectors suggested that a spark 
chamber might be an effective astrophysical 7-ray detector. In the mid-1970s, SAS 2 
p9| and COS B proved the concept of a 7-ray satellite telescope, while discov- 
ering several of the brightest 7-ray sources. Simultaneously, NASA envisioned the 
Great Observatories for Space Astrophysics program: a series of satellite telescopes 
designed to give unprecedented insights into electromagnetic emission from infrared 
to 7-rays. Under the auspices of this program, the Space InfraRed Telescope Facil- 
ity {SIRTF) infrared telescope has been designed, the Advanced X-ray Astrophysics 



CHAPTER 1. INTRODUCTION 



4 



BATS^ let 



ADM MODULE 
|K0rVI51BLE) 



SEPJIUIICM 

• UNDBSTfluCTEIl INmuUEHT 

MFHiO BAVCWi^TRAIHTS 

• KEHSTlEHltSNETKITOREKIEft 

rrrERFEitEhcE 

• SOLAR An nAV<!t««>n(]rAti(n Mjtg,^-;i^|g 

• PASSIVE THERMAUCOriTFIOL 
PLUSKEAIEFtS 

AMb ANTEriUA5> MM I FIED 
PLKiKrHAH[WfAnE 
t Ut^ErECOMHUNICATIIVJS 
ADD OATA HAMDLIHG 
KCJOULE (CflJJHl 




Figure 1.2: The Compton Gamma-Ray Observatory. EGRET is the dome on the 
right end of the spacecraft. It is coahgned with COMPTEL. Rounding out the instru- 
ments aboard CGRO are OSSE, which is very sensitive to e~e~^ annihilation hues, 
and the omni-directional BATSE 7-ray burst detector. 



Facihty (AXAF) has been designed and built, the Hubble Space Telescope has rev- 
olutionized optical astronomy, and the Compton Gamma- Ray Observatory {CGRO) 
has explored the 7-ray sky from less than 0.1 MeV to more than 10 GeV (Figure |1.2|) . 

The five orders of magnitude in energy of the electromagnetic spectrum observed 
by CGRO require four different instruments on the satellite. The lowest energy 7- 
rays interact primarily through the photo-electric effect. The Oriented Scintillation 
Spectrometer Experiment (OSSE) covers the energy range from 0.05-10 MeV with a 
field of view of 3?8xll?4 [|78[. The Compton Telescope (COMPTEL) detects Compton 
scattered electrons, the most significant 7-ray interaction in the energy range between 
1 MeV and almost 30 MeV, to image the 7-ray sky with a field of view of ~1 sr 



178 1 . The Energetic Gamma- Ray Telescope Experiment (EGRET) measures pair- 



conversion events in a spark chamber, like 5*745' 2 and COS B. It is sensitive to energies 



between 20 MeV and 30 GeV, with a field of view of ~1 sr |7T|, 0. In addition, a 
fourth instrument aboard CGRO is optimized to detect 7-ray bursts. The Burst and 
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Transient Source Experiment [BATSE) consists of eight uncollimated detectors, one 
on each corner of the CGRO spacecraft, sensitive to 25 keV-2 MeV 7-rays with nearly 
uniform coverage of the sky . 

The EGRET instrument is the focus of Part | of this work. The instrument was 
built and operated by a collaboration of scientists at Stanford University, Goddard 
Space Flight Center (Greenbelt, Maryland), the Max Planck Institut fiir Extrater- 
restrische Physik (Garching, Germany), and the Grumman Aerospace Corporation 
(Bethpage, New York). It was launched aboard CGRO on the Space Shuttle Atlantis 
(STS-37) on April 5, 1991 and was deployed two days later. It was activated on 
April 15, and began taking data on April 20. The instrument and its characteristics 
have been extensively documented |7T|, ^ ^ |15U| , |193| ; we will briefly touch on the 
highlights relevant to data analysis in §E^. 



Future 7-ray telescopes will further extend our understanding of astrophysical 7- 
ray processes. The GLAST instrument, scheduled for launch in 2005 ||134| , |^, |1^, |15|, 
will improve upon the successes of EGRET as it brings 7-ray astronomy to the 21st 
century. The test of a GLAST science prototype will be the focus of Part || of this 



work. Details of the current instrument baseline design are given in §3.2 



1.2 The EGRET instrument 

Spark chambers detect 7-rays via pair production. Pair production refers to the 
process whereby a 7-ray converts to an electron-positron pair in the presence of 
matter. The detection process is more fully described in § p.l| . The resulting electrons 
and positrons are easily detected because they are charged particles. 



1.2.1 Instrumental Design 

To optimize the detection and resolution of 7-rays, the EGRET instrument consists 
of a series of thin tantalum (Ta) sheets interleaved with planes of conducting wires 
spaced by 0.8 mm. Below this multilayer spark chamber is a Nal(Tl) calorimeter 
known as the TASC (Total Absorption Shower Counter). Surrounding the spark 



CHAPTER 1. INTRODUCTION 



6 



ANTI-COINCIDENCE 
COUNTER 



CLOSELY SPACED 
SPARK CHAMBERS 




TIME 
OF FLIGHT 
COINCIDENCE 
SYSTEM 



Nal(TI) ENERGY 
MEASUREMENT 
COUNTER 



WIDELY SPACED 
SPARK CHAMBERS 



PRESSURE VESSEL 



Figure 1.3: The EGRET instrument. The total height of the spark chambers is 
approximately 1 m. 



chamber is a monolithic plastic scintillator to reject charged cosmic ray particles. 
The schematic design is shown in Figure 

99.5% of 7-rays will pass undetected through the anticoincidence scintillator into 
the 28 closely-spaced spark chamber modules, where it has a ~33% chance of con- 
verting to an e~e"'" pair. If it does so, the pair will ionize the (mostly) neon gas 
in the spark chamber along their trajectories. Below the closely-spaced modules is 
a time-of-fiight system designed to measure whether the particles are upward- or 
downward- moving. The system consists of two layers of 4x4 arrays of plastic scintil- 
lator tiles, spaced 60 cm apart. By combining measurements from the two layers, the 
time-of-fiight delay can be measured to within ~1.5 ns, and the general direction of 
the particles can be estimated. At any given time, a limited number of general direc- 
tions are considered valid for an instrument trigger. The valid directions depend on 
orbital parameters, and are designed to exclude the Earth's limb, as well as limiting 
the field-of-view when desired. 

If the time-of-fiight system registers the passage of a downward particle from a 
valid region of the sky, and the anticoincidence system has not been triggered by a 
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SAS 2 COS B EGRET GLAST 

Field of View 
Effective Area >100 MeV 
Angular Resolution^ 
Energy Resolution^ 
Point Source Sensitivity]^ 
(photons cm~^ s~^) 

"■RMS at 500 MeV 

^full-width, half maximum at 100 MeV 

c ft 
>100 MeV, 10 s exposure, unless noted 

'^1 year, high-latitude, >100 MeV, 5ct 

Table 1.1: Performance of four high-energy 7-ray telescopes. Continually improving 
technology is reflected in improving performance from the earliest instruments, SAS 2 
and COS B, through EGRET and on to the proposed GLAST instrument. 



0.25 sr 0.25 sr 

100 cm^ 70 cm^ 

1?5 1?5 
-100% 42% 

IQ~6 iQ-6 



1.0 sr 2.6 sr 

1200 cm2 ~7000 cm^ 

0?6 ~0?1 

18% 10% 

10-^ <4 X 10-90 



charged particle, a high voltage pulse is applied to the wires in the spark chamber 
modules. Ionized paths short the wires to ground, and the affected wires are recorded 
digitally in ferrite cores. 

The Total Absorption Spectrometer Calorimeter (TASC), located below the spark 
chamber, measures the total energy of the 7-ray event. In consists of 8 radiation 
lengths of Nal(Tl), and has an energy resolution of ~20% FWHM from a few tens of 
MeV to several tens of GeV. Events are tagged with an arrival time by the CGRO 
on-board clock to an absolute accuracy of 100 /is and a relative accuracy of 8 /is. The 
energy measurements made by the TASC are corrected on the ground for energy lost 
in the spark chamber and shower leakage. 



1.2.2 Instrumental Calibration and Performance 

Good calibration of the EGRET instrument was critical to the proper understanding 



of its data. The calibration was as extensive as its literature [|193| , p^ , pS] , |115| , |150 
only the results will be stated here. There are three areas in which we will need 
to know the instrument performance: the point-spread function, or the distribution 
of the measured 7-ray incident angles as a function of the true incident angle; the 
sensitive (or effective) area, or the physical area for collecting 7-rays multiplied by 



CHAPTER 1. INTRODUCTION 



8 



the efficiency, as a function of position on tfie sky at any given time; and tlie energy 
dispersion, or tlie distribution of measured energy as a function of the true energy. 

These three functions were measured and recorded in tabular form as a function 
of aspect angle and energy. Their use in data analysis will be discussed in § p.2| . 
A reasonable approximation to the point-spread width assumes a relatively simple 
functional form. The half-angle which defines a cone containing ~68% of the 7-rays 
from a point on the sky may be taken as 

^68 = 5?85(^/100 MeV)"°-^3^ (1.1) 



where E is the energy in MeV ||193|| . 



The sensitive area and energy dispersion are not easily expressible in functional 
form. Tables of their values were created in machine readable form, and analysis 
programs access them directly. 

The performance of the EGRET instrument compares very favorably with its 
predecessors, SAS 2 and COS B. The order of magnitude increase in effective area 
and improved point-spread function lead to the order of magnitude improvement in 
the point source sensitivity. A comparison of the telescopes, along with the proposed 
GLAST telescope, is shown in Table 

1.3 Successes with EGRET 

The Compton Gamma-Ray Observatory has been very fortunate in successfully achiev- 
ing and exceeding its design goals. Still operating some 7 years after launch, it has 
almost quadrupled its planned lifetime. While the entire observatory has been crit- 
ical in advancing our understanding of astrophysics from 15 keV to 30 GeV (most 
notably the shocking revelation from BATSE that the mysterious 7-ray bursts are 
isotropically distributed on the sky), we will concentrate here on the contributions 
made by EGRET, with a view toward future advancements to be made by GLAST. 



EGRET has significantly improved our understanding of pulsars [|3, 191]. Six 



7-ray pulsars have been identified by EGRET, and their pulse periods have been 
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measured. EGRET observations of pulsars will be discussed in Chapter ^. Signif- 
icant advancements have been made through observations of the Galactic and 
extragalactic ||212|| diffuse background emission. The largest number of identified 
EGRET sources are the 7-ray blazars. Roughly 60 blazars have been identified above 
100 MeV, leading to new insights into blazar emission mechanisms ||6^. While the 
BATSE 7-ray burst measurements were the most revolutionary discovery, EGRET 
has made significant contributions to our understanding of 7-ray bursts at the highest 
energies [p.22|] . 7-Ray bursts will be discussed in Chapter ^. 

GGRO has yielded a wealth of information about the 7-ray sky. Much of that 
information has directly increased our understanding of astrophysical systems. In 
particular, EGRET has given us an unprecedented view of the high-energy 7-ray sky. 
EGRET has identified a great number of new sources; the launch of GLAST will give 
us a tool to understand what exactly it is that we have found. 



Chapter 2 

Statistical Methods in 7-Ray 
Astronomy 

Optical astronomy has long been famous for breathtaking images of distant galax- 
ies, star-forming clouds, and beautiful nebulae. While 7-ray astronomy can produce 
equally beautiful results, the nature of the photons and the instrument yield data 
which must be analyzed very differently than data from other wavelengths. The first 
major difference is the sparsity of the photons; integration times of days are usually 
required to observe all but the brightest sources. A multitude of astrophysical condi- 
tions also affect the nature of the data. Cosmic ray interactions with Galactic dust 
and gas produce diffuse background 7-rays. Different energy generation mechanisms 
produce different spectral profiles across the energy range observed by EGRET. Sev- 
eral specific instrumental responses also shape the data analysis process. A finite 
point-spread function means that 7-ray directions will be determined to within a sta- 
tistical distribution around the true source direction. Instrumental sensitivity to three 
orders of magnitude in energy across the field of view of more than a steradian is not 
uniform as the platform orbits at 17,000 miles per hour. Meanwhile, our estimates 
of all these parameters depend on the 7-ray energy, which is only known to limited 
accuracy. 

Despite these impediments to observation, EGRET has been very successful in 
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Figure 2.1: All 7-rays above 100 MeV measured by EGRET in phase I, II, and III, 
binned at a scale of 0?5. 

making 7-ray observations. This chapter will examine the nature of the EGRET 
data, and the statistical methods used to analyze it. 



2.1 The Nature of the 7-Rays 

Several major features of the high-energy 7-ray sky are evident from the simplest 
possible examination. Figure P?^ shows the raw photon counts for the whole sky in 



0?5 x0?5 bins in Galactic coordinates. The Galactic center is evident at the center of 
the map, as is the Galactic disk. There are several bright spots in the plane, as well 
as some evident high-latitude sources. We will first examine the diffuse background, 
then see how statistical methods along with understanding of the background will 
help us identify point sources and estimate their locations and fluxes. 
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2.1.1 Diffuse Background 

Almost two decades before the launch of S AS 2, Hayakawa ||6^ predicted that high- 
energy 7-rays would be produced as cosmic rays interacted with interstellar gas yield- 
ing pions, which would decay, directly or indirectly, into 7-rays. Indeed, experience 
with SAS 2, COS B, and EGRET has shown strong correlations of the diffuse 7-ray 
background at low Galactic latitudes with known Galactic structural features such 



as the spiral arms |39| , |63| , |119|| . Based on indications that the diffuse 7-ray flux ob- 
served by SAS 2 and COS B was approximately the intensity and shape expected 
from cosmic ray interactions with interstellar matter, a model of the diffuse flux was 
made for the purposes of ECRET data analysis |Ty, [T5[| . 

There are three main processes by which diffuse 7-rays are generated. The domi- 
nant process above ~70 MeV is the decay of pions. Pions are produced when cosmic- 
ray protons interact with dust particles or gas. These pions then decay to high-energy 
7-rays. Below ~70 MeV, 7-rays can be produced by either bremsstrahlung of cosmic 
rays in interstellar clouds or by inverse-Compton upscattering of low-energy photons. 

A good model of the Galactic diffuse 7-ray intensity thus requires a good model of 
the distribution of interstellar matter in the Galaxy, and a good model of the cosmic 
ray flux throughout the Galaxy. The first may be well approximated using maps of 
the distribution of hydrogen, which comprises most of the interstellar matter in the 
Galaxy. Atomic hydrogen has been carefully mapped with observations of the 21 cm 
hyperfine transition line. Molecular hydrogen is more difficult to map, but may be 
approximated by assuming that CO is a good tracer, easily identified by its 2.6 mm 
emission line. A constant ratio of CO to molecular hydrogen is typically assumed 
throughout the galaxy. 

Much more difficult is the estimation of the Galactic cosmic ray flux. Since the 
flux cannot be directly measured, assumptions about the distribution of cosmic rays 



must be made. For the purposes of ECRET analysis, Bertsch [10| and Hunter |75 



assume that cosmic rays are in dynamic equilibrium with the interstellar magnetic 
pressure and the gravitational pressure of the galactic matter. These assumptions, 
convolved with the instrument point-spread function (§ p.4|) , result in the map of dif- 
fuse Galactic 7-rays shown in Figure 12^^, which is in good agreement with the observed 
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Figure 2.2: The assumed Galactic contribution to the diffuse 7-ray background, 
taken from measurements of atomic and molecular hydrogen, and convolved with the 
EGRET point-spread function. This is a map of Gij, as defined in equation ( 2.20|) . 



intensity map shown in Figure |2.4|. Of course, at high galactic latitudes the diffuse 



background is primarily extragalactic ||182|| , although there is significant galactic dif- 
fuse background at high latitude. Some or all of this extragalactic background is due 
to a large number of weak sources, while some of it may be due to a truly diffuse 



background ||212|. 



2.1.2 Spectral Differences 

EGRET^s broad energy range allows the spectra of sources and the diffuse background 
to be measured. The spectrum of almost every EGRET source is well fit by a power 
law. Since 7-rays are so sparse, the power laws are usually quoted as a function of 
the number of photons, instead of a function of energy as is sometimes used for other 
wavelengths. The form is then 



I{E) = Jo^"" photons cm-2 s'^ MeV" 



(2.1) 
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where I{E) is the differential photon flux. The spectral index a is close to 2.0 for most 
sources, though it can be as low as 1.42 for pulsars. The Galactic diffuse background 
has a bit softer spectrum — about 2.1. 

2.2 Instrumental Effects 

The quality and nature of the data depend equally on the photons and the instru- 
ment which observes them. Any attempt to analyze the data must consider the 
instrumental response as the basis for an analysis method. The three main aspects of 
the instrumental response that we must consider are the point-spread function, the 
sensitive area, and the energy dispersion. There is no reason, a priori, to expect that 
the instrument response function should separate cleanly into these three functions. 
However, it offers great simplification to the data analysis, and in practice seems to 
be a good approximation. 

2.2.1 Point-Spread Function 

It is important to be able to quantify the ability of a 7-ray telescope to correctly 
reconstruct the true incident direction of a 7-ray. To precisely define this, we will 
distinguish between the "point-spread density" and the "point-spread function." The 
point-spread density, or PSD, refers to the probability density distribution of incident 
7-ray directions measured by the instrument from a point source. This distribution 
may in general be a function the true position of the point source (the inclination 
and azimuth relative to the centerline of the telescope) and the energy of the 7-ray: 

PSD^PSD(e,(j);9o,(t>o,Eo) (2.2) 

where <Po represents the true source position and Eq is the true 7-ray energy. The 
apparent 7-ray direction is {9, (p). Often, we will require a probability, as opposed to a 
probability density. The differential probability of measuring a photon in a differential 
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element dO d(j) is given by 

dP = FSD{e, (j); Bo, 00, ^0) sin 6 dO dcj) (2.3) 

The point-spread function is the differential probability, often integrated over azimuth 
and energy to yield a function of inclination, as in equation (|2.6|) . 

The EGRET PSD was measured at the Stanford Linear Accelerator Center in 
1986. A beam of electrons with tunable energies between 650 MeV and 30 GeV was 
back-scattered off pulsed laser photons. 7-Rays were produced between 15 MeV and 



10 GeV by inverse-Compton scattering |115|. The point-spread density was measured 



as a function of apparent 7-ray position for 10 discrete energies, 5 inclination angles 
(^o) and 3 azimuthal angles (0o)- The resulting tables yield the relative probability 
of detecting a photon at (6', 0) assuming values of the other three parameters. In 
addition, EGRET operates in up to 87 different "modes," corresponding to different 
triggering criteria.^] These modes are designed to maximize the operating field of 
view, even when part of the geometric field of view is obscured by the Earth or its 
limb. 

2.2.2 Sensitive Area 

A second function which is clearly critical for data analysis is the sensitive area of 
instrument. The sensitive area (or effective area) is the projected area of the detector 
multiplied by its efficiency. It too is a function of incident 7-ray parameters: 

SA = SA(0o,0o,^o) (2.4) 

Clearly, the sensitive area of the instrument is also dependent on the instrument 
mode. 



* These different modes correspond to allowing only photons from certain broad regions of the 
sky as defined by coincidence of different combinations of time-of-flight tiles. 
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2.2.3 Energy Dispersion 

Finally, the analysis must consider energy dispersion. The energy dispersion function 
gives the distribution of measured energy for a given true energy. The measured 
energy varies from the true energy because of noise in the photomultipliers, fluctua- 
tions in the shower leakage from the calorimeter, and incomplete correction for energy 
losses elsewhere in the instrument. 

ED = ED(E;Eo,^o,0o) (2.5) 

Taken together, these three functions yield the point-spread width approximation 
in equation ( |1 . 1|) in the following way. We first notice from the calibration data that 
the point-spread function is roughly azimuthally symmetric, and that it does not vary 
widely with the true inclination angle. Then we can find 

PSF{9) = — / E'~''FSF{e,E')EB(E',E)dE'dE (2.6) 

iV JE=Emin JE'=0 

where E is the measured 7-ray energy, E' is the true 7-ray energy, and is a nor- 
malization factor. The deviation between the apparent incident angle and the true 
incident angle is 6. The point-spread function PSF(6') is the integral of the true- 
energy dependent point-spread function, weighted by the spectrum, integrated over 
the measured energy band from Emm to E^ax, and integrated over all true energies, 
weighted by the energy dispersion function. This reflects the fact that there is some 
probability that a 7-ray of any given true energy will have a measured energy between 

Emin and Ejyiax ■ 

This integral was done numerically for a number of energy bands, and a Gaussian 
fit to the results led to equation (pTTp ||193| . 
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2.3 Likelihood Analysis 

The sparsity of astrophysical 7-rays and the comphcated instrumental response of the 
EGRET instrument suggests statistical data analysis that functions at the photon- 
by-photon level, taking into account backgrounds and the instantaneous instrument 
state to extract the most information from each photon. Early analyses of COS B 
data were based on a cross-correlation method However, this method could not 
easily handle the highly structured background that is typical of high-energy 7-ray 
astrophysics. Later, a maximum likelihood technique was brought to bear on COS B 
data with much greater success [164|. Based on this success, maximum likelihood 



techniques were adopted for use with ECRET | |117| ]. 

The central idea of likelihood analysis is very simple. Given a set of models, we 
wish to find the model which is most likely to be responsible for the observed data. 
The likelihood is defined as the probability of the observed data, given a choice of 
model. The likelihood is written as follows: 



C{D\M) (2.7) 

where D represents the observed data, and M the model. Quantities to the right 
of the I sign are taken as given and fixed, making the likelihood a conditional prob- 
ability. The maximum likelihood method determines the best model by maximizing 
this likelihood function. A special, but very common, case is that of a parameterized 
model. For example, consider that we wish to measure the flux of a 7-ray source. We 
will imagine an idealized detector with 100% efficiency and an angular area of a sr 
from the source. The number of 7-rays emitted in a unit time is Poisson distributed, 
so the likelihood of measuring n' photons from a source with intensity /i is given by 

where n is the number of photons emitted from the source, and n' = (a/47r)n. To 
find our best estimate of /i, we maximize this likelihood. In fact, we will find it 
more convenient (and mathematically equivalent) to maximize the logarithm of the 
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likelihood. 



In C{D\fi) = —fi + nln/i + In 



(2.9) 



The last term is constant, and thus for maximization purposes can be ignored. Setting 
the derivative of equation ( p.9| ) to zero, we find 



= -1 + n/fi 



n 



-n 



a 



(2.10) 



Unsurprisingly, in this simple example, we find that the number of photons measured 
per unit time divided by the subtended angle fraction is the best estimate of the flux. 

One other example is illustrative. Let us assume we have some Gaussian process 
with a constant mean and a known variance a^. A series of observations x is made, 
and the most likely mean value /j, is desired. The likelihood function is 



£(x|/x) = ne- 



(2.11) 



Taking the logarithm, we have 



ln£(x|/.) = -l/2E^^^^ 



(2.12) 



We notice that — 21n£ is formally equivalent to x^- fact, this is an example of a 
general result: in the limit that all distributions involved are Gaussian, the maximum 
likelihood result is the same as the minimizing result. This is an example of a more 



general result known as Wilks' Theorem [pll|| ; it will be described more thoroughly 



in 52.5 



Maximum Likelihood Confidence Regions. Maximum likelihood methods also 
yield confidence regions. Following Eadie, et al. we first consider the case of a 
Gaussian distribution with unit variance and unknown mean fi. The likelihood is 



C{x\fi) = (27r)-^/2 



exp 



N 



--5](Xi -/i)2 
1=1 
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(27r)-^/^ 



exp - — {x-fj,y exp --^{xi- xf (2.13) 



i=l 



The In C is thus a parabola in fj, of the form — ^(/i — x) 



^. In the case that A?" = 1, let 



ln>C > —1/2. This corresponds to the interval — 1 < yU — a; < +1. From the properties 
of the normal distribution, we know that this must contain 68.3% of the distribution. 
Similarly, the interval corresponding to ln£ > —2 contains 95.5% of the distribution. 

This would be merely a curiosity if not for the following. Suppose that the likeli- 
hood function is a continuous function of /i, with only one maximum. In that case, 
we may reparameterize our observed variable in terms of a new variable g{fi) such 
that the likelihood as a function of g is parabolic. We may now find the confidence 
region in g as we did above. Given that region, we may invert the reparameterization 
to find a confidence interval in /j,. Furthermore, we notice that the function is the 
same, whether it is parameterized hy /i or g. That is. 



Thus, we can find the confidence region in /i directly by determining the point at 
which the ln£ has decreased by 1/2 for 68% or 2 for 95.5%, without ever finding 
the reparameterization g. However, note that the interval is central in g since the 
likelihood as a function of is a Gaussian, but it is not necessarily central in /i. 



2.4 Applying Likelihood Analysis to EGRET 



We can see now how to proceed in EGRET data analysis. Take all the data accumu- 
lated in some time period. The hkelihood of that data is the product over differential 
elements of angle and energy of the Poisson probability density of detecting the pho- 
tons, given the rate of photons times the probabihty of detecting each photon. The 
rate /i can be expressed as 



lii.C{x\iJ,) = lnC{x\jj,{g)) 



(2.14) 



fx{e,b,E;eo,bo) 



(2.15) 




E=0 



[/(£o, &o)PSF(^, 6, E'; £o, bo) + B{i, b)] SA(£, b, E')ED{E', E)dE' 
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The rate is a function of the measured energy and position on the sky. The source 
intensity / depends only on the true source position (£o,&o)- The background B is 
assumed to have aheady been convolved with the point-spread function. The three 
instrument functions depend on the instrument mode m as well; this dependence will 
be suppressed for clarity. 

The likelihood is then the product over all differential parameter elements of the 
Poisson probability of measuring (or not measuring, as was the case) a photon in 
that element, given the rate in that element from equation (|2.15| ). The appearance 



of derivatives of products encourages us to use the logarithm of the likelihood. De- 
noting the integrated rate over measured energies as /i = Je^^"^ /^(^; b, E) dE, the log 
likelihood becomes 

\nC{io,bo) = [ [ [-fi{i,tJo,bo) +n\nfL{i,b;io,bo)]didb (2.16) 

where n is the number of photons observed in a differential element didb. Since the 
element is differential, this must be either 1 or 0. The integral thus divides into an 
integral and a sum: 

TV 

ln£(4,M = - / hiii,b;io,bo)didb + Y.\nmi,,bi;io,bo) (2.17) 

where the sum is evaluated for the parameters of each photon. While this looks fairly 
simple, p, is quite a complicated object, implicitly containing four integrals. We would 
like to maximize equation ( |2.17| ) over and bo, the source position. Computationally, 
this is a herculean task, requiring evaluation of a six-dimensional integral at each trial 
point {io,bo). Clearly, some simplification is necessary. 



2.4.1 Binned Likelihood 

The most obvious simplification is to give up on individual photons, and create binned 
maps. While binning is always undesirable, as it loses information in the data, in this 
case binning is minimally undesirable. Our derivation above considered differential 
elements in the parameters; essentially, we let the bin size go to zero. It has been 
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shown that using a finite but small bin size speeds computation dramatically at very 
little expense of accuracy ||199| . 



A standard analysis program called LIKE was developed for the map-based likeli- 
hood analysis of EGRET data ||117|| . LIKE considers the total expected 7-ray rate in 



each pixel, typically 0?5 x0?5 on the sky.[| This rate is the sum of the expected rates 
from the isotropic background, the galactic background, and a point source. In order 
to estimate rates for binned maps, the photons from any desired observation interval 



are binned as in Figure p.l| . In addition, the instrument exposure to the sky must 
be calculated. This is a function of the amount of observing time and the sensitive 
area during that time. Let T{6, 0, m) be the amount of observing "livetime," (that is, 
elapsed viewing time that the instrument was active, excluding occultations and in- 
strument dead time) for a location {6, (p) on the sky in an instrument observing mode 
m between time ti and ^2- Only the total observing time in each mode is relevant; it 
need not be contiguous. Then the exposure S{AE] 9, 0) for a given measured energy 
range AE to a point {9, 0) on the sky is 

SiAE; 9, 0, ti, t2) = J2 T{e, 0, m- h)A{AE- 9, 0, m) (2.18) 

where 

_ pEmax poo 

A{AE;9,(p,m) = / E'^'"SA{E';9,(j),m)EB{E,E')dE' dE (2.19) 

J E = Emin J E' = Q 



This exposure explicitly depends on the spectral index of the source or background, 
whichever is being observed. This is the first example of a continuing difficulty; 
the spectral index is required to calculate the exposure, but it is not known until 
after the fiux is determined. In principle, the spectral index should be allowed to 
vary everywhere during the maximization process. An approximation would be to 
iterate. However, since most spectral indices are very nearly 2.0, and varying the index 



t This scale is somewhat arbitrary. It was based largely on the scale to which the Galactic 
hydrogen has been mapped, as well as to ensure sufficient photons in each bin. It was not chosen to 
optimize the likelihood method. Nevertheless, for most energies the bin size is much smaller than 
the instrument point-spread width. 
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Figure 2.3: Combined EGRET exposure in phase I, II, and III. Phase I was an 
all-sky survey, with roughly even exposure. Phases II and III consisted of pointed 
observations for sources of interest. Lighter shade corresponds to more exposure. 

requires recalculating the exposure maps (a computationally expensive prospect), a 
constant index is assumed. The exposure can be calculated for each pixel in the 



standard map; an example is shown in Figure p^S . 

To allow for the degrees of freedom in the background maps described above, the 
background is assumed to be a linear combination of isotropic and galactic back- 
ground whose coefficients will be optimized. The isotropic diffuse count estimate for 
a given bin is Qb^ij, where Qb ("g-bias") is the coefficient to be fit in units of photons 
cm~^ s~^ sr~^, and Sij is the instrument exposure to pixel i,j in units of cm^ s sr. 

The Galactic component of the rate will depend on both the diffuse map and the 
instrument point-spread function, as well as the exposure. Denoting the binned radio 
diffuse gas map as fij, the rate due to the Galactic diffuse background is 

Q . ^ fkiSkiPSFjejjM) ^2 20) 

where Oij^ki is the angular distance between pixel i,j and pixel k, I. The denominator 
is a normalization factor, necessary since the sums over k and / may not be over 
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the entire sky. Pixels far from the point of interest will contribute negligibly to the 
estimation of source flux and location, but it is critical to good flux measurements 
to have a normalized point-spread function. Therefore, we allow analysis of only the 
pixels within some radius of analysis Ranai, but renormalize the point-spread function. 
In addition, to allow for the unknown cosmic ray flux and the CO/H2 ratio, we will 
allow a constant multiplier ( "g-mult" ) to be optimized as well. 

So, then, given k sources in the field of view, the expected number of counts in 
bin i,j is 

fJ'ij = gmGij + QbSij + CfcPSF(%fc) (2.21) 

k 

where Oij^k is the angular distance between the position of source k and pixel i.j. This 
count estimate is a function of 3fc + 2 parameters: the k source strengths, latitudes, 
and longitudes; g^, and gi,. The fit values of g^ are consistently in the range 0.92-1.08. 
The fit level of g^ is usually around 2 xlO~^ photons cm~^ s~^ sr~^ p3 |. 



2.4.2 Maximizing the Likelihood with LIKE 

Just as in the exact case, the distribution of photon counts in each bin is Poisson. 
The likelihood of a given map, then, is: 



C{D\ck,ik,bk,gTn,gb) = n 



ij ^ij- 



(2.22) 



with /ijj given by equation ( |2.21| ). The maximum likelihood estimates of Ck,ik,bk, gn 
and gi, are simultaneously solving the the set of equations 



_d_ 
dck 

_d_ 

dfk 

_d_ 

dbk 
d 
dgm 



ln£(/i) 
ln£(/i) 
ln£(/i) 
ln£(/i) 










(2.23) 
(2.24) 
(2.25) 
(2.26) 
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Figure 2.4: 7-Ray intensity as measured by EGRET. The counts map in Figure 
is divided by the exposure map in Figure 0| to yield the 7-ray intensity. 



^ln£(/i) 
ogh 



(2.27) 



where fi is optimized to satisfy all these conditions. This is just the multi-dimensional 
maximization of the likelihood function — there are 3A; + 2 equations to solve simul- 
taneously. For this reason, it is usually computationally much faster (although less 
accurate) to find the brightest source first, then fix its parameters and fit the next 
source. Once the locations of the sources are established, equation ( |2.21| ) divided by 
the exposure map yields an intensity map of the sky in Figure p.4|; equivalent to the 
optical image observed simply by peering through the eyepiece of an optical telescope. 



2.5 Parameter Estimation vs. Hypothesis Testing 

The foregoing discussion concerned the estimation of the position and flux of a source 
which was assumed to exist. The maximum likelihood technique not only allows for 
parameter estimation as described above but also for hypothesis testing. Hypothesis 
testing estimates the significance of the source detection. Highly significant sources 
are reliable; low significance sources may be statistical fluctuations. 
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Consider two hypotheses, Mq and Mi. For concreteness, let us assume that Mq, 
the null hypothesis, states that there is no source in the field of view. Further, we 
assume that Mi states that there is a source with non-zero fiux in the field of view. 
If there are n degrees of freedom in Mq, and m degrees of freedom in Mi, then 



Wilks' Theorem ||211|| states that —2 times the logarithm of the likelihood ratio is 
asymptotically distributed as xfm-ny 

_ C{D\Mo) , 

C{D\Mi) ^ ' 

In the case of a source whose position is known from other experiments, there are 
3 degrees of freedom in Mi {ck,gm,gb) and 2 degrees of freedom in Mq {gm,gb)- 
Thus, the likelihood is distributed as Xi- We define the EGRET test statistic TS = 
— 2(ln£o ~ InCi). From the definition of the distribution, this implies that the 
significance of a detection is given by v^TSa in the standard nomenclature. Thus, a 
TS = 16 source is detected at the 4(T level. Of course, that is for a single trial, and in 
fact many sources observed by other telescopes have been searched for with EGRET. 

The case of previously unknown sources is a little more complicated. Mi now has 
5 degrees of freedom: Ck, ik,bk, gm, gt- But the null hypothesis is degenerate in the 
position — the likelihood is independent of position when the fiux is zero. In that case, 
Wilks' Theorem does not hold, so the distribution is not known. Some theoretical 



work has been done to determine this distribution |p.98|| . Based on Monte Carlo 
simulations, EGRET sources are accepted as detections if, for sources at Galactic 
latitudes \b\ > 10°, v^TS > 4, and for sources at Galactic latitudes |6| < 10°, v^TS > 5. 
Monte Carlo simulations have shown that for a perfect background model, roughly 
one spurious excess with TS > 16 will be detected in an analysis of the entire sky 



2.6 Bayesian Methods 

Despite their formal similarity, conceptual differences between maximum likelihood 
analysis and Bayesian methods have kept the latter relegated, for the most part, to the 
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statistical backwaters of astrophysics. Some recent work using Bayesian methods has 
yielded useful results |174| , |11(J| , since Bayesian methods will be an appropriate 
alternative framework for later chapters, a brief overview will be presented here. 

Traditional, or frequentist, statistics are predicated on the notion that extreme 
values of some function of the data given a null hypothesis indicate that the null 
hypothesis is probably wrong, and therefore the test hypothesis is probably true. 
Much of the confusion of the general population about "Statistics" is due to the two 
twists involved in frequentist analysis: first of all, the goal is to disprove the thing 
we suspect false, rather than find evidence for what we suspect true; and second, 
this is done by considering all possible outcomes from an ensemble of data sets. In 
the case of likelihood statistics, we calculate the statistic (— 21n£) and compare it 
to the distribution expected from the ensemble of data sets that might be generated 
if the null hypothesis were true. If the measured value of the likelihood is extreme, 
according to this distribution, we claim that the null hypothesis has been excluded 
to some confidence level. 

2.6.1 Bayes' Theorem 

In contrast, the Bayesian method demands that we stay at all times in the realm of 
probability: specifically, the probability that the test hypothesis is true. To develop 
the mathematics, let us begin with what Scargle [ p.74|| calls the "obviously true" form 
of Bayes' Theorem: 

P{M\D)P{D) = P{D\M)P{M) (2.29) 

Formally, this follows from the definition of conditional probabilities. The probability 
of some statement M being true, given that another statement D is true is the joint 
probability of M and D. Any joint probability may be expressed as the probability of 
one statement times the conditional probability of the other. Or stated another way, 
the probability of A and B equals the probability of B and A. Thus Bayes' Theorem 
is proved. 

Clearly, the notation in equation (|2.29|) is suggestive. As we have identified above, 
the probability of the data, given a model, is known as the likelihood of the data. 
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Equation (|2.29|) points out that while we have been calculating hkelihoods, what we 
really desire is P{M\D); that is, given the data that we have, what is the probability 
that a given model is true? This is the question which Bayesian methods set out to 
answer. To that end, we rearrange equation ( p.29|) into a more useful form. 

C(D\M)P(M) 
P{M\D) = ^ ^ (2.30) 

Equation dOOp gives us exactly what we want: the probability of a model being 
true, given the data that we have observed. We do not resort to any hypothetical 
ensembles of data, and more importantly, we make direct statements about the model 
in question. 

Analogously to our likelihood calculations in § |2.3| , the model in question may be 
one of a discrete series, or it may be parameterized. In the case that the model is 
a function of a continuous parameter, then the left side of equation (|2.30|) , known 
as the joosienor probability, becomes a function P{M{6)\D). It is interpreted as the 
probability that the true value of the parameter ^0 is given by 6. 

Given the importance of the posterior probability, it behooves us to understand 
the right side of equation (|2.3(]|) . The likelihood has been fully discussed. P(M) 
is known as the prior probability of the model. Except (apparently) in the case of 
quantum mechanics, probabilities are used to represent ignorance. We do not know 
the true value of a parameter, or which model is correct, so we assign probabilities 
to represent the knowledge that we do have. P{M) represents the knowledge we 
have about the system before we receive the data D. A common example is that 
of a parameterized model in which we know that the true value of the parameter 
6 lies somewhere between 6min and ^maxQ- The prior is then flat over the interval 
[0min,0max] ^^d zcro clscwhere. In other situations, it may be more appropriate to 
take a scale-invariant prior, which would have a logarithmic form. Such a prior is 
generally referred to as a "least informative prior." It reflects only information about 
the structure of the experiment, and does not favor any specific outcome. Of course. 



■It will often be possible to take the limit of our results as Omin ^ —00 and 9 



max 
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if there is specific information about the true parameter value, the prior should reflect 
that information. 

Finally, we must address the denominator of equation (|2.3CI| ). This term, the 



probability of the data, serves as a normalization. It expresses the probability of 
measuring the data regardless of which model is actually true. If this can be rigorously 
calculated, then the left side of equation ( |2.3CI| ) will be a well-normalized probability 
distribution. In practice, it is often more practical to sidestep the issue by forming 
the odds ratio. 

2.6.2 The Odds Ratio 

Usually it is the case that we compare two discrete models, or that we compare a 
discrete model with a class of models characterized by a finite number of continuous 
parameters. In that case, the odds ratio conveniently handles our normalization 
issues. The odds ratio is formed by comparing the posterior probabilities of the 
different models. Consider two discrete models, Mq and Mi. For concreteness and 



comparison with § p.5| , let us assume that Mq states that there is no source, and that 
Ml states that there is a source of a given flux at a given position. Then the odds 
ratio is 

P{Mo\D) ^ C{D\Mo)P{Mo) 

P{Mi\D) C{D\Mi)P{Mi) ^ ■ ' 

This gives the probability that there is no source relative to the probability that 
there is a source. Note that it only allows those two possibilities. A value of 1/3, 
for example, would mean it was three times more likely that there was a source than 
that there was no source. We know nothing about the probability that there were 
two sources. 

Of course, the example in § p.5| was more complicated than this. The model Mi = 
Mi(/x) was a class of models, parameterized by the source strength. The odds ratio 
in that case is also a function of the parameter: 

_ P(Mi(^)|D)P(Mi(^)) 
^^^^ - P(Mo|D)P(Mo) ^^-^^^ 
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The function 0{fi) gives the odds that there is a source of strength /i at a given 
position versus that there is no source. In most cases, we will be interested in the 
total odds ratio; that is, the odds that there is a source regardless of its strength. 
This is akin to the detection significance calculated in ^2.5| . We may calculate this 
ratio using the procedure known as marginalization. 



2.6.3 Marginalization and Confidence Regions 

We may eliminate uninteresting parameters by marginalization. This process acquired 
its odd name through a historical accident, when "integrations" were carried out 
numerically by adding columns of numbers into the margins of the page. In essence, 
the method is mathematically simple. Given any conditional probability, and the 
probability of the condition, we may integrate to eliminate the condition: 

P{A) = / P{A\B)P{B)dB (2.33) 

The application to the odds ratio is immediately clear. We simply integrate the 
numerator over all possible /i to find the total odds ratio. 

A very similar process may be used to find confidence intervals. Consider the 
situation when a source is known to exist. We then wish to find the best estimate of 
its flux, and a confidence interval for that estimate. Since the flux /i must take on 
some positive value, we may evaluate the denominator of equation ( p.30|) : 

P{D)= C{D\M{fi))P{fi)dfi (2.34) 
Jo 

Then the probability that the true value of /i is between /x_ and /i+ is 

/o C{D\M{fi))P{fi)d^ 

For a given confidence level, there are an infinite number of choices of confidence inter- 
vals, as there are with frequentist statistics. There is no requirement that confidence 
intervals be contiguous. However, useful intervals can often be found by requiring 
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/i_ = fi — 6fi and fi+ = fi + 6fi, where fi is the most hkely value of /i |3^. The 
resulting interval is central by definition, and in the limit that the probability density 
is symmetric about the maximum and is smaller everywhere outside the interval than 
it is inside the interval, it is minimal. 



2.6.4 Advantages and Disadvantages 

Advocates of frequentist and Bayesian methods have unfortunately been polarized 
into two extreme camps, with only the most acrimonious communication between 
them. In fact, both methods are rather like a powerful hunting rifie. Used properly, 
they are efficient and successful at doing their job. Improper use may result in 
permanent catastrophic injury and/or death. We will briefiy examine the objections 
to both methods, and in doing so find that the disagreements are actually objections 
to using the methods improperly. 

The major objection to the Bayesian method is the use of a prior. It is said that 
the prior subjectivizes what should be an objective procedure, and therefore reduces 
the results to a sort of "modified best guess" of the experimenter. The Bayesian 
responds that that is exactly true; indeed, if we use probabilities to express our 
ignorance, then we should hope that the results of an experiment reduce our ignorance. 
Furthermore, the Bayesian claims that all assumptions and prior knowledge are made 
explicit under the Bayesian formulation. It is certainly clear that the prior is the 
Achilles' heel of the Bayesian method. There is no objective, prescribed method for 
obtaining a prior. However, there is also no objective, prescribed method for choosing 
a traditional statistic. The use of x^, for example, implicitly assumes that the errors 
in the measurements are Gaussian, which may or may not be the case. 

The primary objection to traditional statistics is the ad hoc procedure of selecting 
a statistic. Statistics are chosen based on their power and appropriateness, to be 
sure, but the choice is also largely guided by experience. The justification for the 
choice of statistics is generally its past success, rather than any a priori reasoning. In 
contrast, the Bayesian method prescribes the calculation for any experiment: form 
the likelihood and weight by the prior. 
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A secondary objection is philosophical in nature. Bayesians prefer to treat the 
true value of the parameter as a random variable, with the experimental data as the 
fixed and unchanging measure of reality. The parameter then has some probability of 
falling within the confidence region. The traditional approach treats the true value 
of the parameter as fixed and unchanging, and the data as only one possible outcome 
in an imagined ensemble, a shadow or projection of the true reality. The confidence 
region that we calculate from this instantiation of the data then has some probability 
of covering the true value of the parameter. f\ 

Despite these objections, the two methods are actually compatible, and under 
the right circumstances, equivalent. In most circumstances, the maximum likelihood 
method is equivalent (for parameter estimation) to the Bayesian result with a flat 
prior. Whenever the implicit assumptions of a traditional analysis are matched by 
the explicit assumptions of a Bayesian analysis, the results will be the same. 



2.7 Calculating Upper Limits 

A significant positive detection of a source is the goal of all telescopes. However, null 



results can also be useful ||133| , P0CI|| . It is often valuable to set an upper limit on the 7- 
ray flux of a source known from X-ray, optical, or radio observations. Determining the 
value of the upper limit is a statistical endeavor that requires a very careful definition 
of the goal. It has been noted that "the question of how to calculate an upper limit in 
the vicinity of a physical boundary is one of the most divisive in high-energy physics" 
0. Unfortunately, this is precisely the limit in which we find ourselves. A negative 
source flux is unphysical; nevertheless, a measurement of a weak (or non-existent) 
source in the presence of a large background may easily result in a data set best fit 
with a negative source flux. In order to combine such a result with other results. 



the flux must be reported as negative, with the confidence region found as in §2.3 



^The philosopher will note a certain correspondence to historically important epistemological 
viewpoints. Sartre may debate Plato on the true nature of reality, but both are crushed by the rock 
of Sisyphus. 
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under the maximum likelihood method [^]. Otherwise, any combination of results 
from different observations or different experiments would be biased. 

Once we have agreed upon a point estimate of the parameter, we must consider 
the confidence region that we wish to comprise the upper limit. In analogy to the 
confidence regions around point estimates, we will define the "la upper limit" (in 
frequentist terms) to be the top of an interval which will, when constructed from an 
ensemble of data sets, contain the true value of the flux 68.3% of the time. In Bayesian 
parlance, this means that the integral of the posterior probability distribution from 
zero to the la upper limit will be 68.3%. 

2.7.1 Upper Limits from LIKE 

The upper limits generated by LIKE do not fulfill this definition. There are three 
situations for which LIKE must generate an upper limit. The first is when the flux 
measurement is positive, but the confidence region extends to negative flux; e.g., 
a measurement of 5 ± 10. The second is when the flux measurement is negative, 
but the confidence region extends into positive territory. The third is when the flux 
measurement and the entire confidence interval are negative. 

LIKE handles these situations in the following way. An upper limit is always 
quoted, and the flux measurement is quoted only if certain conditions are met. If 
TS > 1 and the flux is positive, the flux and confidence regions are quoted. The upper 
limit is the top end of the confidence interval. If TS < 1 or the flux is negative, only 
an upper limit is quoted. (Note that this immediately introduces the bias discussed 
above.) If the flux is positive, the upper limit is the top end of the confidence interval. 
If the flux is negative, LIKE finds the width of the confidence region, then shifts it 
so that it is centered on zero. The upper limit is then the top end of the shifted 
confidence interval. 

The upper limit calculated for strong sources with positive flux clearly does not 
fulfill our definition. The confidence interval on the point estimate will contain the 
true value 68.3% of the time. The interval between zero and the upper limit is larger 
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than the confidence interval, and so must contain the true value at least 68.3% of the 
time. 

The upper limit for sources with positive flux whose confldence intervals extend 
to negative fluxes also does not fulflU our deflnition. The 68.3% upper limit is found 
from the one-sided confidence interval; that is, the value au.i. such that 



where f{fi\no) is the probability of measuring fi given a true flux of /io- In the case of 
a Gaussian function /, the confldence of the integral evaluated to cr„.;. = jl + a may 
be evaluated. In general, the confldence of an interval extending to an upper limit of 
fi + a depends strongly on the shape of /. 

The most difficult situation is when the maximum likelihood source strength is 
negative. Barrett, et al. suggest "lifting up" the measured flux to zero, evaluating 
the likelihood function and taking the upper limit. Instead, LIKE flnds the width of the 
confldence interval about the measured (negative) flux, then centers that confldence 
interval about zero. It is very difficult to estimate the probability that a confidence 
interval obtained in this way would cover the true source strength; one must consider, 
for each possible value of the true source strength, all possible data sets. Any one of 
those data sets could fall into any of the categories we have outlined. 

2.7.2 Calculating More Accurate Upper Limits 

The upper limits calculated in this way are not only confusing and non-intuitive, they 
also do not (in general) fulfill our requirement: a confidence interval from zero to the 
upper limit should cover the true value 68.3% (or some other specified fraction) of the 
time. A Bayesian approach is in this case more intuitive and fulfills our requirements. 

Instead of considering the ensemble of data sets that might produce a given fiux 
measurement, we start directly with the data, and consider the range of true fiux 
values that might produce the data. The requirement that all fiuxes must be positive 
is easily fulfilled; we form a prior that is fiat for /i > and zero for negative /x. 
Our posterior probability distribution is identical to the likelihood, except that it is 




(2.36) 



CHAPTER 2. STATISTICAL METHODS IN 7-RAy ASTRONOMY 



34 











/ 0,1 




/ 0.05 




, , , , poo 


1 , , , 1 



-4-2 2 



Figure 2.5: Upper limits calculated by Bayesian methods. This hypothetical like- 
lihood function represents a measurement that gives a most likely value of some 
parameter to be less than zero. If the parameter is constrained by the physics to 
be positive, the Bayesian formalism suggests forming a prior which is zero for neg- 
ative parameters values, and constant for positive parameter values. The posterior 
distribution would then look like the shaded portion of the curve. 

zero for all negative values of fi, and renormalized so that the integral over all values 
is unity. (Figure |2.5|) . The interpretation is then straightforward. The most likely 
value of the flux is zero; the 68.3% upper limit is found by integrating the posterior 
distribution from zero to some value cr„,/, where the integral equals 68.3%. 



2.7.3 Implications for Extant Conclusions 

There are two sorts of conclusions typically drawn from EGRET upper limits. The 
first is generally qualitative. A known X-ray or radio source is not detected in the 
EGRET data. A "low" upper limit is taken as evidence that 7-ray emission is small 
or non-existent. A "high" upper limit is taken as weak evidence that there is little 
7-ray emission, but that there was not sufficient data to draw a conclusion. For these 
sorts of conclusions, the LIKE upper limits are adequate (e.g., ||109| , |77|). 

The second sort of conclusion is generally quantitative and statistical in nature. 
Often, possible source variability is examined in a number of observations. Some 
EGRET observations of a source (usually an AGN) yield significant detections, and 
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some yield only upper limits. Given flux measurements and confidence regions, it is 
a simple matter to formulate a or likelihood test to determine if a constant flux 
model is compatible with the data. It is critical for such a test that upper limits have 
well-defined statistical properties. The upper limits generated by LIKE and quoted 
in the EGRET catalogs do not have these properties ||195| , |196| , |65| . Unfortunately, 
catalogs of variability have been compiled based precisely on these upper limits ||121|| . 
All variability conclusions which involve upper limits are suspect, and should be 
treated as qualitative suggestions rather than quantitative results. Work by W. F. 
Tompkins is in progress to compile variability catalogs which have been calculated 
with statistically meaningful upper limits ||199|] ; these should be used as soon as they 
are available. 



Chapter 3 
7- Ray Bursts 



In the midst of the Cold War, in the late 1960s, a number of satellites were launched 
carrying 7-ray detectors. Sensitive to 7-rays between ~200 keV and ~1.4 MeV, the 
Vela satellites were designed to detect the testing or use of nuclear weapons. Between 
July of 1969 and July of 1972, four Vela satellites, equally spaced in the same circular 
orbit, detected 16 bursts of 7-ray energy. Comparisons of the 7-ray arrival times 
in different satellites determined that the origin of the bursts was more than 10 
orbital diameters away. Thus, the first theory of 7-ray bursts, Soviet nuclear testing, 
was ruled out. Although national security concerns delayed the publication of these 



intriguing results by Klebesadel, Strong, and Olson |^ for a number of years, they 

would mark the beginning of the longest- standing mystery in astrophysics since the 

Shapley-Curtis debates. 

A number of other bursts were observed over the next two decades | [L29| , ^08| ]. A 

consensus emerged fairly early that the source of the mysterious 7-ray bursts was 

Galactic in origin [|53|.Q Preliminary detections of flux over 1 MeV strengthened this 

conclusion ||149|| ; Schmidt "showed" that detection of emission over 1 MeV required a 

Galactic origin, since the source luminosity required for bursts at more than ~100 pc 

would imply an energy density that would result in 7-7 pair production | |175| — 

the optical depth to Earth for 1 MeV 7-rays would be >1. While there was never 

*This work will deal only with the so-called "classical 7-ray bursts." Another class of transient 
7-ray sources, the soft gamma repeaters, are clearly a different type of object. 
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significant evidence as expected that the bursts were preferentially located in the 
Galactic disk, it was presumed that the BATSE experiment on board CGRO would 
clear up any remaining ignorance about 7-ray bursts. 

Despite the general consensus on the location of 7-ray bursts, the field attracted 
a wide variety of specific theories. Nemiroff identifies 99 distinct 7-ray burst theories 



put forth between 1968 and the end of 1991 | [L48|| . Early theories placed bursts 
both locally and cosmologically distant, with energy generation mechanisms ranging 
from relativistic dust grains in the solar neighborhood to cometary collisions with 
neutron stars to white hole emission to vibrations of cosmic strings at z = 1000 [|. 
Nevertheless, by 1981, most models involved neutron stars in the Galactic plane. 
(Note, however, that Paczyhski [ |157| | put forth a theory of cosmological bursts with 
an optically thick e~e"*" plasma outflow in 1986. However, Paczyhski also put forth 
a number of other unrelated theories of 7-ray bursts; thus it is unclear whether his 
apparent success is due to prescience or judicious covering of theory space.) 

In September 1991, Gerald Fishman, representing the BATSE team at the first 
Compton Symposium, summarized the results of the 7-ray burst search since the 
launch of CGRO in April of that year. Everything that had been believed about the 
origin of bursts was shaken. The data, published first by Charles Meegan, Fishman, 
and others in Nature ||123|| , showed an isotropic distribution of 7-ray bursts across 
the sky. Furthermore, there was already evidence of the so-called "edge" of the 
7-ray burst distribution. That is, there were fewer low-flux bursts than would be 
expected if bursts were standard candles uniformly distributed in Euclidean space. 
The distribution was incompatible with a galactic disk population. 

Old theories die hard, and much effort went into searches for the expected spatial 
and temporal correlations between bursts in the BATSE catalogs. While it appeared 
in the first BATSE catalog that some correlations might exist, additional bursts in 
the second catalog made isotropy much more likely Such statistical tests must 
be performed very carefully, due to the biases acquired from variable thresholds and 
exposures ||160| , p,62| , |107|| . Similarly, only very marginal evidence could be found 
for repeating 7-ray bursts | |161| |. The distribution of all bursts observed to date 
(Figure |3]5) is compatible with a uniform distribution across the sky ||125| . 
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Figure 3.1: Locations of 7-ray bursts observed by BATSE as of July 3, 1998, in 
Galactic coordinates [ |125| . 



The BATSE data quickly winnowed away many 7-ray burst theories. But like 
opportunistic species after an ecological disaster, new theories quickly sprung up to 
fill the void (e.g., ||186|| ). The new theories were classified primarily by the location of 



the bursts: locally (in the Oort cloud), in the hypothesized large Galactic halo, or at 
cosmological distances. The first of these was essentially abandoned when no suitable 
energy generation mechanisms could be envisioned. |] Halo models generally involved 
interactions with old neutron stars that had been ejected from the plane into the halo. 
While the presence of a dark Galactic halo has been inferred from the rotation curves 
of other galaxies, its extent has not been conclusively measured . The isotropy of 
the 7-ray burst distribution suggested that the radius of the halo would need to be 
many times the distance between the Earth and the Galactic center. Similarly, the 
halo of M31 (Andromeda) would be evident in the distribution once BATSE detected 
a large enough sample of bursts. The isotropy of the first 1122 7-ray bursts detected 
by BATSE [ |124[ suggests if the bursts were in the Galactic halo, the halo would have 



a radius greater than 100 kpc. To be consistent with the lack of any excess toward 
Andromeda, the scale would have to be less than 400 kpc; a rather tight constraint. 
If the bursts were at cosmological distances, evolution effects conveniently explain the 



^C. D. Dcrmer proposed a mechanism at a conference in Santa Barbara, California in 1995 
involving antimatter comets annihilating with normal comets in the Oort cloud. This theory was 
apparently never published. 
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observed "edge" of the distribution. The energy required to produce the observed flux 
from such distances (~10^^-10^^ ergs, modulo any beaming factor) is a few percent 
of the binding energy of a neutron star. Such an energy scale leads naturally to 
catastrophic theories involving neutron stars. 

It had been realized quite early ||175|| that the large energy release required for 
distant 7-ray bursts would produce an environment that was optically thick to 7-7 
pair production. An assumption of cosmological origin of the 7-ray bursts led to a 
bifurcation of theories into those describing the energy generation mechanism and 
those describing the propagation of a large amount of compact energy, regardless 
of its source. The relevance of the latter theories for EGRET observation will be 
discussed in ^K^- 



3.1 Recent Observations 

In 1981, Fichtel & Trombka speculated that "identification of the [7-ray burst] objects 
with observations at other wavelengths will probably be required before significant 
progress can be made in determining their origin" |^ . The day for which they were 



waiting arrived February 28, 1997, when the Italian satellite BeppoSAX discovered an 
X-ray afterglow to a 7-ray burst pO| , pl[| . The BeppoSAX satellite has a collection of 



instruments, including a 7-ray burst monitor, a wide-field imaging X-ray camera with 
a positional resolution of about 3 arcminutes, and a narrow-field X-ray camera with 
a position resolution of about 1 arcminute. Within 8 hours of the detection of the 
burst and its imaging with the wide-field camera, the narrow-field camera observed a 
source, consistent with the burst error circle, at 02000 = 05'^01™57'^, ^2000 = 11°46'42" 
Optical and radio transients at the same position were quickly found 



Some time later the optical transient was also seen by the Hubble Space Telescope 



173|| . The optical, radio, and X-ray sources were observed to decay as a power law 
consistent with the fireball blast wave theories of Meszaros and Rees ||127|| and Wijers 
et al. ^|. 



In the following months several other bursts would be found to be consistent with 
transient sources in other wavelengths. Another revelation came from GRB 970508, 
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observed May 8, 1997 by the wide-field camera on BeppoSAX and by its narrow-field 
camera 5.7 hours later. An optical counterpart was quickly found by Bond and a 
radio counterpart found by Frail et al. consistent with the narrow-field BeppoSAX 
position. Metzger and his colleagues at Caltech identified a set of absorption features 
implying a redshift of > 0.835 ||132|| . By early June, they detected emission lines as 
well, at the same redshift ||130|| . For the first time since their discovery 30 years before, 
there was direct experimental evidence that 7-ray bursts were cosmological | ]131 |. 

Another notable burst, GRB 971214 was detected at a number of different 
wavelengths: optical ^9\, infrared [55|, and ultraviolet [IC], as well as being de- 
tected by a number of 7-ray and X-ray instruments including BATSE, RXTE, and 
Ulysses Absorption and emission lines have been detected from this burst as 

well, yielding a redshift measurement of 3.43 — firmly establishing the cosmological 



nature of the 7-ray bursts [|101|| . Most recently, the redshift of the host galaxy of 
GRB 980703 has been measured at the Keck Observatory. Absorption and emis- 
sion lines yield a redshift of z = 0.966 [Q. A good review of the current state of 
observational affairs is given by Wijers ||209 . 



Now that the observational mechanism for establishing 7-ray burst counterparts 
has been established, the field is changing rapidly. As of this writing, 11 7-ray bursts 



have been observed in radio and optical wavelengths ||108|| . These observations and 
others which will certainly be made in the near future have strongly affected, and will 
continue to affect, the leading theories of 7-ray bursts. 



3.2 7-Ray Burst Models 

Until the launch of BATSE, 7-ray burst theories outnumbered the bursts themselves. 
In light of the data provided by BATSE, BeppoSAX , and other satellites, it is worth- 
while to examine the leading theories with an eye to understanding the consequences 
observable by EGRET. We will first look at the theories of the energy source power- 
ing 7-ray bursts, and then examine some of the theories of the 7-ray generating shock 
waves created by the bursting source. 
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3.2.1 Energy production mechanisms 

Gigantic explosions hold fascination for the theoretical physicist as much as for the 
schoolchild. Cosmological 7-ray bursts require the largest explosions known, and 
thus attract theorists in droves (e.g., [0). As long ago as 1975, Malvin Ruderman 



noted ||172|| , "For theorists who may wish to enter this broad and growing field, I 
should point out that there are a considerable number of combinations, for example, 
comets of antimatter falling onto white holes, not yet claimed." Mercifully, the con- 
siderable experimental data amassed since 1975 has largely narrowed the field to two 
general mechanisms: neutron star-neutron star mergers and massive stellar collapses 
known as "collapsars" or "hypernovae." Nevertheless, at this stage in our understand- 
ing of 7-ray bursts it is unreasonable to think that all theories could be placed in one 
of only two categories; these represent only the most prominent of the theories. 

Neutron star mergers. The most popular theory at the time of this writing is 



the neutron star merger model ||145|1 . The basic premise is very simple. Neutron star 
binary systems slowly lose energy as a result of gravitational radiation. Eventually, 
the neutron stars will spiral into each other, presumably resulting in a large release 
of energy. A variant of the model replaces one of the neutron stars with a black 
hole. Narayan | |145|| cites several advantages of such a model, as well as a number of 



issues to be resolved. First, the source population is known to exist. Neutron star 
binaries have been observed, and their energy loss corresponds with that expected 



from gravitational radiation to better than 1% ||188|| . Second, the energetics are of the 
right order. The energy release from such a merger would exceed 10^^ ergs in ~1 ms 
within 100 km. Finally, the frequency of such mergers may be approximately right. 
Estimates of the merger rate include the range from 10^^ to 10^^ yr~^ per standard 
galaxy, which matches the observed burst rate under a cosmological scenario and 
isotropic emission from the energy source. 

An early objection to the neutron star merger model was the so-called "no host 
galaxy" problem. Optical observations of 7-ray burst error circles before the launch 
of BeppoSAX failed to find the galaxies expected as the hosts of the colliding neutron 
stars. However, it was realized that binaries can often acquire substantial recoil 
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velocities as a result of their two supernova explosions. For reasonable parameters, 
binaries may travel 1-1000 kpc before merging ||201|| . Therefore, by the time a binary 



neutron star system becomes a 7-ray burst, it may have left its progenitor galaxy. 
The diverse character of observed 7-ray bursts has been pointed out as a problem 
for the merger model. Neutron stars are expected to have a very narrow mass range. 
Merging binaries should almost always consist of about 3 Mq collapsing in a very clean 
gravitational system. Such a homogeneous energy generation mechanism should result 
in a fairly homogenous population of 7-ray bursts, though beaming effects, magnetic 
fields, and an inhomogenous environment may explain the observed differences. 



Hypernovae. A much newer idea is the hypernova model due to Paczyhski [158|. 



The idea is similar to Woosley's "failed supernova" model ||214|| . A very massive star 
undergoes core collapse to form a ~10 Mq black hole. If the star is rapidly rotating, 
then the angular momentum of the star requires the formation of a rotating dense 
torus of material around the Kerr black hole. Any previously existing magnetic field 
will be significantly affected by the collapse; most possibilities strengthen the local 
magnetic field. Paczyhski estimates that the rotational energy of the black hole should 
be of order 10^"^ ergs. He also finds that the maximum rate of energy extraction is 

.„„«10»Ws-(^)^(^)' ,3.1) 

Achieving the required energy release requires fields on the order of 10^^ G. No mech- 
anism for generating such fields is offered, although many theorists are currently 
working on the details of the effects of the core collapse on the ambient magnetic 
fields. 

The observed spectrum from either model of 7-ray bursts depends more on the 
details of the fireball model discussed in the next section than on the energy genera- 
tion mechanism. The primary difference between these mechanisms is their location. 
While neutron star mergers are often expected far from galaxies, hypernova are found 
in star-forming regions, since their progenitors are rapidly evolving massive stars. The 
lack of optical observation of GRB 970828, despite fairly deep searches, combined with 
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the reported large hydrogen column density | 142 | has led Paczyhski to conclude that 
the optical emission was not observed due to extinction by dust. 

A substantial number of detections of 7-ray bursts at multiple wavelengths should 
illuminate this question in the very near future. 

3.2.2 Blast wave theories 

While very early experiments seemed to detect thermal radiation from 7-ray bursts, 
better instruments soon observed a characteristic power law emission in the X-ray 
and 7-ray regime. Theoretical explanations have centered on the blast-wave model, 
where the injection of a large amount of energy into a very small volume leads to 
an expanding fireball, optically thick to pair production, expanding into some sur- 
rounding medium. The simplest model — also due to Paczyhski | |157| | — assumes the 



spherical expansion of an optically thick e~e+ plasma with no surrounding medium. 
This model results in a blackbody spectrum. A series of more and more compli- 
cated models were based on this simplistic one, finally culminating in the model of 
Meszaros, Rees, and Papathanassiou |p.28|| , which calculated spectra expected for a 
variety of magnetic field configurations and particle acceleration efficiencies, including 
the shock fronts and reverse shocks which arise from the expansion of the burst ejecta 
into the surrounding medium. 

Shock fronts in blast waves have become the baseline from which various energy 
generation mechanisms diverge. Their popularity stems from their success: they can 
naturally explain how energy can be reconverted into 7-rays from the particles that 
must emerge from the fireball, they can naturally explain the observed burst time 
scales, and, assuming the medium into which the fireball expands varies slightly from 
burst to burst, they can explain the wide variety of burst profiles and time scales 
observed [165, 204 1. The observed afterglows may be explained as emission from 



the slowing, cooling shock |p05 , 203 |. However, several theoretical issues remain. 



The mechanism of particle acceleration in relativistic shocks is not well understood, 
though it is widely assumed that such acceleration results in a power law energy 
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spectrum 0. The magnitude and nature of the magnetic field is unknown. Pre- 
existing magnetic fields, if sufficiently strong, will provide significant synchrotron 
radiation in the particle blast wave. In addition, turbulent mixing in the shock can 
generate significant fields due to charge separation. Finally, the coupling between 
electrons, baryons, and the magnetic field is not well understood. Energy radiation 
from elections is very efficient, but most of the energy is in the baryons, or perhaps 
the magnetic field, which do not radiate their energy so readily. The nature of their 
coupling affects the total radiated power as well as the time scale of the radiation 



168 



While a detailed investigation of the various blast-wave models is beyond the 
scope of this work, a general overview is given, based on that found in Chiang & 
Dermer [^. The burst begins with the deposition of 10^^-10^^ ergs of energy in 
a radius of about 100 km. The nature of this energy is not known. It is assumed 
that most of this energy will be transformed into kinetic energy of baryons, which 
expand adiabatically. The baryons soon become cold; that is, the baryon speeds in 
the comoving frame of the bulk fiow become sub-relativistic. The bulk Lorentz factor 
is then given by Fq — Eq/Mqc^ where Mq is the rest mass of baryons. 

As this sphere freely expands into the surrounding medium, it sweeps up material. 
A shock front begins to form. At some "deceleration radius" the shell can no longer 
be approximated as freely expanding, and the bulk kinetic energy of expansion begins 
to be reconverted into internal energy of the baryons. This radius is the point at which 
the integrated momentum impulse of the swept-up matter equals the original baryonic 
rest mass: rj^ ~ (3/47rpFo)Mo where p is the density of the surrounding material. 

Predictions of what happens at this deceleration radius form the core of most 
blast-wave models. A function F(r) is derived which expresses essentially the rate at 
which baryonic kinetic energy is converted to radiation. Since cosmological scenarios 
require high initial Fq of order 10^-10^, much of the energy release in all scenarios is 
compressed into the first few tens of seconds in the observer's frame. Nevertheless, 
models of the blast wave at the deceleration radius and beyond make predictions 
about the observed spectrum at various times as well as the observed burst time 
scales. 
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Given the recent evidence that 7-ray bursts really are cosmological in origin, the 
energy required for the fluences observed on Earth make some sort of fireball almost 
inevitable. Blast-wave models have qualitatively reproduced some aspects of 7-ray 
burst observations, and appear to be a promising theoretical road to pursue. However, 
it should be noted that blast waves as they have been described cannot be the whole 
story. In either the neutron star merger model or the hypernova model, there is 
a natural symmetry axis associated with the source. A symmetric blast wave then 
seems rather unlikely. Asymmetry in the blast wave will lead to beaming on some 
scale. Beaming will of course reduce the energy required per burst, but increase the 
rate at which the bursts must occur. It is not clear how beaming would affect the 
radiation produced by shock waves. The radiation reaching the Earth has already 
been beamed into a cone of order l/Fg. This tight beaming means that all of the 
7-rays observed at the Earth are emitted from a very small piece of the blast wave; 
therefore, the amount of large-scale symmetry in the wave may be irrelevant to the 
spectrum observed. 

3.2.3 Observable consequences 

Without the compass of observations, we are doomed to drown in an ever-deepening 
sea of theories. Fortunately, the tools of the experimental astrophysicist are becoming 
more and more powerful. A number of X-ray satellites are currently operating, with 
more planned, which produce error boxes small enough to allow radio astronomers to 
bring their substantial instruments to bear. GLAST will revolutionize high-energy 
7-ray astronomy in the next decade just as EGRET did in this decade. And new 
types of astronomy, previously relegated to science fiction or crackpot dreams, are 
beginning to provide useful data. Air Cerenkov detectors can localize high-energy 7- 
rays from a few hundred GeV up to many TeV to much less than a degree. Bottcher 
& Dcrmcr calculate the high-energy emission from proton synchrotron radiation and 
photopion-induced pair cascades, and find that future high-sensitivity Cerenkov tele- 
scopes with low energy cutoffs (or, it should be noted, good very-high energy response 
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from GLAST; see Chapter |^) could measure the level of the infrared background ra- 
diation [|TB|, since high-energy 7-rays will interact with the infrared background to 



produce electron-positron pairs. While no current neutrino detectors have sufficient 
sensitivity, a measurement of the neutrino flux from a 7-ray burst could shed light 
on the energy generation mechanism ||206|| . Additionally, an intense gravitational col- 



lapse will produce gravitational waves, which could be detectable with the detectors 
currently under construction |^ . 



3.3 EGRET observations 

While BATSE has enjoyed the spotlight for most of the contributions of CGRO to 
the 7-ray burst problem, EGRET has made some important observations as well. 
At least 16 bursts occurred outside the field of view of the spark chamber, but were 



nevertheless detected in the calorimeter |22, |103|, 177 1. Five 7-ray bursts have been 



detected in the EGRET spark chamber Each has characteristics worthy of some 
examination. 

The first burst detected in the EGRET spark chamber occurred on May 3, 1991. 
Six photons were detected in the spark chamber in two seconds, coincident with the 
signal received from BATSE ||1 76|| . Measurements of the background before the burst 
suggested that 0.18 photons per two seconds were expected in the spark chamber. 
Additionally, the burst was evident in the TASC calorimeter data, as well as the 
anticoincidence trigger rate. The anticoincidence dome is sensitive to photons above 
about 20 keV via Compton scattering, while the TASC has four triggering levels, 
corresponding approximately to energies above 1.0 MeV, 2.5 MeV, 7.0 MeV, and 
20.0 MeV. Firm detection of this event with all three systems, coincident with the 
BATSE trigger not only demonstrated 7-ray burst detection with EGRET but also 
measured the first 7-ray burst emission above 100 MeV: one of the spark chamber 
photons had an energy of approximately 230 MeV. 

The next burst was detected a month later, on June 1, 1991 | |103|| . This burst was 
also seen in the TASC and anticoincidence dome, and four photons were detected in 
the spark chamber when 1.5 background photons were expected. The detection of 
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high-energy photons from these bursts began to shift the evidence toward power-law 
emission instead of thermal emission. Furthermore, the last photon to arrive in the 
spark chamber was nearly 70 seconds after the initial BATSE trigger. 

Buffalo Bills quarterback Jim Kelly remembers January 31, 1993 as the day the 
Bills lost their third consecutive Super Bowl, this time to the Dallas Cowboys by 
a score of 52-17. High-energy astrophysicists worldwide remember that date for 
the detection of what still stands as the most intense BATSE burst ever seen | ]100 |. 
BATSE count rates exceeded 2x10^ counts s~^. Significant structure to the burst was 
observed at scales below 10 ms, and the first intense spike lasted 200 ms. Fortunately, 
the burst was in the field of view of the EGRET spark chamber, which detected 16 
7-rays (compared to 0.04 expected by chance) in 25 seconds | |181| |. Two of the 7-rays 
had energies of nearly 1 GeV. EGRET measured a power-law photon spectrum with 
index ~— 2. The hard photon spectrum lent further credence to shock acceleration 
models, although the high-energy photons were taken as evidence that the burst was 
closer than ~50 pc. A total fluence measurement could not be made by EGRET, 
since the dead time per spark chamber event is approximately 100 ms. It is likely 
that a number of high-energy 7-rays passed though the spark chamber in the first 
100 ms. 

It would be more than a year before EGRET would detect another burst, but 
that burst would provide plenty of fuel for the raging 7-ray burst debate. The burst, 
which arrived on February 17, 1994 and whose initial pulse lasted for 180 s, was 
detected by COMPTEL |9|, Ulysses, BATSE, and EGRET. Of primary theoretical 
importance was a single EGRET photon, detected some 4500 s after the initial burst, 
with an energy of 18 GeV |76||. The probability of this photon originating in the 
background was 5x10^^. In fact, EGRET detected a total of 28 photons from GRB 
940217. Ten photons were detected during the first 180 s, concurrent with the BATSE 
detection, including a 4 GeV photon and a 3.5 GeV photon. The remaining 18 
photons were detected over the next 1.5 hours. Hurley et al. |j76| point out a number 
of conclusions which can be immediately drawn. While the universe is optically thin 
at 7-ray energies, the cosmic microwave background becomes an efficient medium for 
7-7 pair production for sufficiently high 7-ray energies. For a 25 GeV photon, this 
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consideration limits the source to z<55. To avoid attenuation from the intergalactic 
infrared background, the source distance is further hmited to z<2.5, ruhng out early 
universe theories 0]. Finally, they note that if the spectrum of the first 180 s of 
GRB 940217 is extrapolated to 1 TeV, the predicted fluxes would be detected by air 
Cerenkov telescopes and the Milagro air-shower array. Unfortunately, bursts with 
such high-energy emission are evidently somewhat rare, and the fields of view of air 
Cerenkov detectors are quite small. Nevertheless, the current network of fast burst 
position detections may allow observation of delayed emission like that of the February 
17 burst. The very long delay of emission has also been the source of much theoretical 
speculation. The existence of such delayed emission is a natural consequence of blast- 
wave models. While the initial 7-ray burst is caused by the formation of a shock with 
the surrounding medium, the blast front will continue to radiate as it decelerates. 

The most recent burst detected in the EGRET spark chamber arrived only a few 
weeks later, on March 1, 1994. This burst was similar to the first two which had been 



detected, with 7 photons of maximum energy 160 MeV, arriving within 20 s ||177 



The spectrum of this burst was softer (~— 2.5) than the February 17, 1994 burst. 

It is unlikely that any more bursts will be observed in the EGRET spark chamber, 
as instrumental lifetime concerns have reduced EGRET to Target of Opportunity 
observation only. Nevertheless, the observations made by EGRET create significant 
constraints on the models used to explain 7-ray bursts. The GLAST instrument, 
with its very wide field of view (§ |6.2| ) and large sensitive area up to very high energies 
(300 GeV), should further constrain the high energy behavior of 7-ray bursts. 



3.4 Possible EGRET-orAy Bursts 

Observations of 7-ray bursts in the last decade have shown conclusively that the 
high-energy emission has a power-law spectrum. If, for some bursts, the burst has 
a harder spectrum than the background, the signal-to-noise ratio should increase at 
higher energies. Some blast wave theories predict that a spectral break may occur at 



or just below EGRET energies |3^. In such models, the total peak spectral power 



early in the burst is emitted above 100 MeV. 
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It is thus possible that some 7-ray bursts will be detectable by EGRET, but 
not by lower-energy instruments such as BATSE. The evidence of such bursts would 



therefore be lurking in the EGRET photon database. As was discussed in § |2.3| , the 
best way to deal with the photon-by-photon nature of EGRET data is through the use 
of likelihoods. Buccheri et al. developed a likelihood method to search EGRET 
data for 7-ray bursts. While they did find the known bursts in the data they searched, 
they apparently did not perform a comprehensive search of the EGRET database. 
The difficulty with such a search is that the exposure for each photon has to be 
calculated individually. This process is very time-consuming, due to the the extended 
nature of the EGRET calibration files. Therefore, a comprehensive search required 
some adaptation of statistical methods ^7j. Those methods, and their results, will 
be described below. 



3.4.1 Statistical Methods 

The 7-ray burst search algorithm was designed to be fast and efficient at finding 
bursts. From a statistical standpoint, a burst may be defined as a time interval with 
a measured rate which is, to some specified confidence, incompatible with Poisson 
fiuctuations. The method, then, is fairly straightforward: first, find the background 
rate; second, find any time intervals with sufficiently high rates to rule out Poisson 
fiuctuations at the given confidence level; and finally, verify that the spatial distribu- 
tion of the photons is consistent with a point source. The sequential nature of this 
method makes it fast compared with a full likelihood analysis; however, it also in- 
volves some binning, and as we will see, complicates our estimates of the significance 
of detections. 

To further speed the algorithm, we search for significant intervals in two steps. 
The first compares photon arrival rates only. The rate is defined as the number 
of photons from some area on the sky per second. It requires only the amount of 
instrument live time and a photon count. The second step will measure the photon 
fiuxes (photons cm~^ s~^). This is certainly the more physically relevant quantity, 
but it is also computationally more expensive to determine the state of the EGRET 
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instrument for the entire interval. Therefore, promising candidate intervals will be 
found by comparing rates, and then checked for significance by comparing fluxes. 

To establish a background rate, the field of view is binned into 5° x 5° squares. 
All photons in the viewing period with energies over 100 MeV and zenith angles less 
than 100?5 are sorted into these bins. The total instrument live time to the center 
of each square is calculated. This yields an array of approximate background count 
rates. The critical number of photons required for a significant interval, Ncru, is given 
implicitly by the Poisson formula 

a= 2^ j 3.2 

where /z is the average rate, t is the time interval being searched, and a is the confi- 
dence level. The probability of observing Nohs>Ncrit is 1 — a. 

To acquire a set of candidate events while avoiding specific time binning, each 
photon is considered to be the start of an interval. All photons arriving within the 
standard interval length (either 1 hour, 30 minutes, 10 minutes, or 3 minutes) and 
within 5° of the initial photon are considered part of the same event and counted. If 
the number of observed photons exceeds Ncrit for that area of the sky, the candidate 
event is accepted for further evaluation. 

Because the number of candidate events which pass the first cut is relatively 
small, it is practical to calculate instrument exposure and compare the candidate flux 
to the expected background flux. The single trial probability of such an event, P^, is 
calculated as in equation (|3.2| ), with the time t replaced by the exposure £ and the 
rate /z replaced with the flux. 

Unfortunately, Ps cannot be interpreted as the probability that the flux in a 
given interval is not a Poisson fluctuation. In fact, we have searched many different 
intervals, and the appropriate confidence level must take into account the number of 
trials. If the intervals had been fixed and preselected so as not to overlap, then each 
interval would be statistically independent. The total probability P/v that a given flux 
could not be attributed to Poisson fluctuations would be given by the complement of 
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the product of the chances that it would not be seen in each of any one intervah 

P^ = l-P^ (3.3) 

where is the number of independent trials. 

However, preselecting non-overlapping time bins would not be a good way to 
search for 7-ray bursts. If a burst did not happen to fall entirely within one bin, but 
instead split its flux between two bins, it would probably not be significant in either 
bin. The sliding interval described above was designed to avoid this problem. Nev- 
ertheless, the sliding interval introduces its own problems: namely, that the intervals 
are no longer independent. 

In order to evaluate the actual significance of an interval Pt as a function of P/v, 
a Monte Carlo simulation was performed. It was expected (and verified) that the 
significance of an interval should be monotonic in P/v (Figure ^3.2] ). Approximately 
5.7x10^ simulated photons were generated for the Monte Carlo data set; the actual 
data set searched contained 1.2x10^ photons. Each Monte Carlo photon was assigned 
an arrival time, drawn from a Poisson distribution with a given background rate. 
Each photon was also assigned a uniformly distributed x and y coordinate between 
0° and 40°, simulating EGRET field of view. The data were simulated using actual 
EGRET exposure information calculated for a typical point on the sky over many 
viewing periods, yielding an exposure set representative of the windowing and non- 
continuous exposure actually obtained with EGRET. This exposure set was used as 
many times as necessary to generate the entire Monte Carlo data set. To determine 
appropriate background rates, 104 actual fluxes were measured from random points in 
differing viewing periods. These fluxes were sorted, and the highest and lowest 20% 
were discounted in order to exclude nonrepresentative outliers. The range of this 
tightened distribution of fluxes was used as the range for the uniformly distributed 
fluxes in the Monte Carlo simulation. This was done to ensure the use of typical, 
though not rigorously representative, background fluxes. Since all probabilities are 
found from the difference between measured and expected fluxes, they depend only 
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Statistic (in units of 1 



Figure 3.2: Overall significance Pt versus raw probability Pn from the Monte Carlo 
simulations. P^ is calculated for each Monte Carlo event from equation ( p.2|) and 
equation (|3.3|). A cumulative count of the Monte Carlo events yielding such a P^ 
yields the cumulative probability of that events, Pt, given by the vertical axis. 



weakly on the precise values of the background fluxes. The range of backgrounds 
found was 7.0x10" ''-4.8x10"^ photons cm"^ s^^ for a 5° circle on the sky. 

In some cases, the search algorithm may have detected the same event more than 
once; that is, the interval following the first photon in the event yielded a significant 
rate increase, and the interval following the second photon also yielded a significant 
rate increase. In these cases, the two probabilities must be independently combined, 
because the Monte Carlo distribution was found by counting all significant events, 
regardless of whether or not they were part of the "same" event. 

In addition to exhibiting a flux increase, 7-ray burst photons should be statistically 
consistent with a point source origin. Candidate bursts found in the time series anal- 
ysis were examined spatially using a likelihood analysis technique [^. Likelihoods 
were calculated for all photons within approximately 20° of the first photon of the in- 
terval. A null model of smooth background was compared to a model with a variable 
position point source plus a background rate taken to be the average background 
found above. To simplify computation, errors in photon position were taken from 
the width of the energy dependent best fit Gaussian containing 68% of the EGRET 
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point-spread function given by equation ( |1.1|) . The usual EGRET test statistic is 
then defined | |117[ as in § p3| : TS = — 2(ln£s — ln£„), where the Cs and are the 



hkehhoods of the source and null models, respectively. The spatial analysis required 
likelihood calculation for each individual photon. The standard EGRET likelihood 
software, LIKE ||117|| , is designed to evaluate likelihoods based on maps of photon 
counts. Separate likelihood software was thus developed for this study. Simplifica- 
tions in this implementation make direct comparison of these values of TS with those 
from LIKE inexact. For the purposes of source detection in EGRET analyses, a TS 
of 16 is considered to be a significant (4cr) detection. This is based on the measured 
distribution of TS. However, the likelihood statistic calculated here suffers from a 
very low count rate; often the likelihoods are computed from less than 10 photons. 
The statistics of TS with very few counts are not well characterized. Furthermore, 
the background rate across the whole 20° is taken to be the same as that found for 
the central 5° circle. In regions where the background is spatially varying, this may 
distort the measured TS. 

Nevertheless, the TS measurement adds valuable information to the evaluation 
of burst candidates. Candidates with a very low TS, corresponding to little or no 
point-like structure, should be discarded. A sharp variation in photon arrival rate 
across a large area of sky is probably not due to an astrophysical process. It should 
be remembered that there is already a selection effect of candidate events due to the 
fact that the time series analysis considers photons confined to a five degree circle. 
This selection effect as well as the low count statistics make translation of TS into a 
confidence level problematic. Note also that the spatial and time series probabilities 
are not completely independent. 



3.4.2 Results 

We analyzed 182 viewing periods using these methods, corresponding to observations 
between 1991 April 22 and 1996 March 21. Viewing periods after this time used new 
instrument modes to compensate for various instrument malfunctions and narrow- 
field viewing. This data was not searched, since the field of view was significantly 
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Date 


Time 


e 


6 


TS 


Number 
Expected 


Number 
Observed 


Max Energy 
(MeV) 


Pt 


1993 Jan 31 


18:57:12 


287 


51 


33.1 


0.048 


6 


1240 


>99.994% 


1994 Feb 17 


23:03:05 


152 


-55 


18.8 


0.074 


5 


3382 


98.8% 


1994 Apr 27 


01:31:01 


121 


-0.7 


26.7 


0.120 


6 


680 


95.4% 


1993 Mar 17 


06:40:42 


65 


21 


17.1 


0.009 


3 


361 


52.2% 


1992 Oct 8 


04:35:04 


201 


31 


15.7 


0.047 


4 


508 


32.9% 


1991 May 4 


05:16:33 


204 


13 


18.7 


0.011 


3 


960 


29.0% 



Table 3.1: Three minute time scale results. Burst candidates are listed with their 
times (UT), galactic coordinates £ and b of the maximum likelihood position, spatial 
TS, expected and observed numbers of photons, highest photon energy, and signifi- 
cance, as derived from the Monte Carlo simulations. 













Number 


Number 


Max Energy 




Date 


Time 


e 


b 


TS 


Expected 


Observed 


(MeV) 


Pt 


1993 Jan 31 


18:57:12 


287 


51 


33.1 


0.160 


6 


1240 


99.85% 


1993 Jul 20 


07:07:55 


121 


42 


23.2 


0.097 


5 


878 


74.0% 


1993 Feb 22 


06:46:44 


20 


-52 


21.8 


0.099 


5 


615 


72.2% 


1992 Mar 22 


08:25:50 


333 


0.4 


29.6 


0.623 


8 


11100 


28.1% 



Table 3.2: Ten minute time scale results, as in Table 3.1 



smaller, and the systematics of the new modes are not as well understood. Each 
viewing period was searched on four time scales: one hour, 30 minutes, 10 minutes, 
and 3 minutes. Results are presented in Tables |3.1| — In each table, Pt is the 
probability that no such events would be detected in the entire EGRET data set for 
that time scale in the null hypothesis. All fluxes quoted below are found by dividing 
the number of photons by the total exposure in the scale time, even if the photons 
evidently arrived in a much shorter time. For events with very few photons, the 
flux is not very well defined, since an arbitrary change in cutoff time may reduce the 
exposure without changing the number of photons. Thus, the flux for the purposes 
below is always found by considering all of the exposure in the period. 

Two of the bursts triggered by BATSE, the Super Bowl Burst of 1993 January 31 
and the 1994 February 17 burst, were also independently detected with this algorithm. 
The other three EGRET-detected BATSE-triggered bursts were not strong enough to 
be independently detected. For comparison, these bursts and their characteristics are 
listed in Table Several independent detections were made, the most significant 
occurring on 1994 April 27. 
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Number 


Number 


Max Energy 




Date 


Time 


e 


b 


TS 


Expected 


Observed 


(MeV) 


Pt 


1993 Jan 31 


18:57:12 


287 


51 


33.1 


0.387 


8 


1240 


97% 


1993 Jul 20 


06:50:45 


120 


42 


28.8 


0.205 


6 


880 


77% 



Table 3.3: Thirty minute time scale results, as in Table p.l| . 

Number Number Max Energy 

Date Time £ b TS Expected Observed (MeV) Pt 

1993 Jan 31 18:57:12 287 51 33l 0.387 8 1240 97% 

1994 Apr 27 21:34:18 169 3 26.0 1.146 10 522 70% 



Table 3.4: One hour time scale results, as in Table 3.1 



3.4.3 Discussion 

Each Pt is the chance that measured photon arrival times are not due to random 
Poisson noise in the entire EGRET data set for that particular time scale. Thus, an 
event Pt = 90% in the one hour time scale would be a spurious detection in 10% of the 
ensemble of possible EGRET data sets searched on a one hour time scale. However, 
four time scales have been searched. If each time scale were independent, this would 
be counted as four trials. But the same data were searched in each time scale, so the 
trials are not independent. Rigorous evaluation of the overall number of independent 
trials is problematic, and is not attempted for fear of producing meaningless results. 

The calculated probabilities also do not take into account the photon energies. 
However, high energy photons are rarer than lower energy photons, so the arrival of a 
similar number of high energy photons would constitute a more significant event. If 
the detected bursts exhibited a plethora or paucity of high energy photons, the event 
energy spectrum would be expected to differ appreciably from that of the background. 
Because each event has too few photons to meaningfully determine a spectrum, a 
hardness ratio was calculated. All photons from all bursts in each time scale were 
collected, and a hardness ratio for the set was calculated. Even such a collection 
suffers from poor counting statistics, as evidenced by the large errors (Table |3.6|) . 
The hardness ratio is defined as the number of photons with energies greater than 
300 MeV divided by the number with energies between 100 and 300 MeV. A typical 
background spectral index of 2.0 corresponds to a hardness ratio of 0.5. The measured 
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Date 


Scalp 


Number 
Expected 


Number 

Observed Ps 




Pt 

1 




1991 Mav 3 


3 min 


0, 


.202 


3 


99, 


88% 


< 


10" 


-m 




10 min 


n 

yj . 


679 


U 


QQ 




< 


10" 


-300 




.'^n min 


I. 


.288 


5 


98, 


.97% 


< 


10" 


-300 




1 liUUi 


I. 


.288 


5 


98, 


.97% 






-300 


1991 June 1 


3 min 


0, 


.396 


4 


99, 


.93% 


< 


10- 


-300 




10 min 


0, 


.633 


5 


99, 


.95% 


< 


10- 


-300 




30 min 


0, 


.633 


5 


99, 


.95% 


< 


10- 


-300 




1 hour 


2, 


.064 


9 


99, 


.97% 


< 


10- 


-300 


1994 March 1 


3 min 


0, 


.102 


3 


99, 


.98% 


< 


10- 


-300 




10 min 


0, 


.333 


4 


99, 


.96% 


< 


10- 


-300 




30 min 


0, 


.569 


4 


99, 


.72% 


< 


10- 


-300 




60 min 


0, 


.569 


4 


99, 


.72% 


< 


10- 


-300 



Table 3.5: Known i?^r5'£'-triggered 7-ray bursts not detected independently by 
EGRET in each time scale. The expected and observed numbers of photons are 
given, along with the single trial probability Ps and the significance Pt as deter- 
mined by the Monte Carlo. 

hardness ratios are consistent with the background hardness ratios, implying that the 
Pt would not be significantly modified if energy information were taken into account. 

All probabilities are dependent on the statistical distribution found from the 
Monte Carlo simulation. Thus, any possible factors which might make the Monte 
Carlo an imperfect simulation of the real data must be examined. Edge effects may 
slightly affect the final distribution, since the 5° circles drawn around photons near 
the edge of the region will reach past the edge. However, a short simulation was done 
in a 20° by 20° square, and the resultant distribution did not differ significantly from 
that calculated with a larger area, indicating that edge effects play a small part. 

The Monte Carlo simulation also assumed spatially uniform exposure and back- 
ground rate in order to make computation time reasonable. These assumptions will be 
poor only when the exposure or rate varies significantly over the 5° circle. Exposure 
generally varies smoothly, so this is probably a good approximation. The background 
rate can vary extensively if a strong point source is nearby, but the diffuse background 
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Time Scale Hardness Ratio 

3 min 0.723 ± 0.338 

10 min 0.385 ± 0.202 

30 min 0.500 ± 0.433 

1 hour 0.500 ± 0.306 

Table 3.6: Hardness ratios for each time scale. The hardness ratio is defined as 
the number of photons above 300 MeV divided by the number of photons between 
100-300 MeV. 

does not vary rapidly on this scale. Thus, we may need to worry about the distribu- 
tion if events in the real data are detected near steady point sources; otherwise, the 
constant background approximation should be relatively good. 

Spatial correlations must also be considered in the evaluation of burst candidates. 
Unfortunately, rigorous treatment of spatial correlations is fraught with difficulty. 
The maximum likelihood distribution is not well characterized for low counts. Fur- 
thermore, the 5° search radius used will select for spatially correlated events. It was 
hoped that spatial analysis would add significantly to our understanding of these 
events; unfortunately, because of these biases and correlations the spatial analysis 
has yielded little additional insight. 

The most interesting candidate event is clearly the 1994 April 27 01:31 event 
in the 3 minute time scale. The significance of this event, while much lower than 
the 5>ir»S'£'-independent detection of the Super Bowl Burst, is almost as high as 
the S^lTO^^-independent detection of the 1994 February 17 burst. This detection 
occurred while EGRET was in its most common, largest effective area mode. The 
Earth zenith angle to the event was ~ 40°, well away from the horizon. It was 
observed 27° off the instrument axis. It thus seems unlikely that it is a spurious 
detection caused by pushing the operating envelope of the instrument. In addition 
to the six photons within three minutes of elapsed time classified as good events, 
there were seven additional events which were rejected by the standard EGRET data 
analysis for a variety of reasons. One of these generated too few sparks in the spark 
chamber, one had its vertex in the wall of the chamber, and the rest failed to produce 
an acceptable time of flight measurement. Estimated trajectories of these events 
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suggest that they originated from the same area on the sky as the good events. 
Although a precise calculation of the significance of these events is impossible, there 
appear to be more such events from that direction on the sky than would otherwise 
be expected. 

The time interval corresponding to the 1994 April 27 candidate was examined 
in the 30 - 100 MeV range as well. No photons were detected during that period. 
However, EGRET sensitive area to this energy range is comparatively small, so that 
only 0.046 background photons were expected (compared to 0.120 in the 100 MeV and 
above range.) The TASC and anti-coincidence dome rates did not show a significant 
change in rate. With such small numbers, it is difficult to assess the significance of 
the lack of low energy photons. 

Furthermore, a check was done of BATSE detections nearby. No trigger occurred 
at that time, although BATSE triggering was enabled. The last previous trigger 
occurred some 20 hours earlier. 

No other bursts were detected so significantly. However, there were several less 
significant detections, making it worthwhile to consider the probability that at least 
one 7-ray burst has been detected. In the three minute time scale, the April 27, 1994 
burst is fairly significant by itself. However, it is interesting to calculate the joint 
probability that all the events in each time scale are due to random Poisson noise. 
We will exclude the Super Bowl Burst and the February 17 Burst, since they are 
already known to be a 7-ray bursts. Table shows the joint probabilities in each 
time scale from the arrival time data. There is apparently some evidence for burst 
occurrence in the short time scales, with no significant evidence for occurrence in the 
half-hour and hour long time scales. 

The detected candidate events were compared to the Second BATSE Burst Cat- 
alog to see if any coincident detections were made. With the exception of the bursts 
actually triggered by BATSE, that is, the Super Bowl and February 17 bursts, no 
coincidences were found. 

While the Poisson error analysis may be encouraging, it is important to consider 
the contribution of systematic errors before coming to any conclusions. The search 
for 7-ray bursts by the method used is most sensitive to systematic errors in the 
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Time Scale 



Probability 



3 min 
10 min 
30 min 
1 hour 



99.0% 
94.8% 
77% 
70% 



Table 3.7: Joint probabilities of detecting all the marginal candidates in each time 
scale. While systematic errors could be responsible, there is apparently evidence for 
burst activity at or below the three minute time scale. 

determination of background rates. If the background is consistently underestimated 
relative to the burst photons, many apparently significant events are not actually 
significant. 

The most likely source of systematic errors in the background rate is the error 
in the estimate of the instrument exposure. A subtle point should be explored here. 
EGRET switches observing modes every few minutes. Each mode has a different 
amount of sensitive area to a given position on the sky, as well as a different field of 
view. Approximately 80% of the time EGRET is in the single mode with the largest 
field of view and sensitive area. In the limit that the instrument is always in the same 
observing mode, any exposure errors will cancel each other, since the error induced in 
the background flux will be exactly the same as the error induced in the burst flux. 
However, if the exposure error is different for each instrument mode, then a burst 
candidate detected in a rare mode might exhibit considerable systematic error. 

The sensitivity of EGRET diminishes as the gas in the spark chamber ages. This 
effect has been partially characterized and compensated for in the EGRET exposure 
data. The time scale for this drift is typically several months, much longer than the 
burst search scales and also at least twice as long as the longest background averaging 
time. 

The nondetection of the 1994 April 27 candidate burst by BATSE must be con- 
sidered in light of current models for 7-ray burst production. The relativistic fireball 
model of Meszaros, Rees, and Papathanassiou | p.28| | suggests the possibility of a high 
energy fiat spectrum tail. If the spectrum is fiat down to the BATSE energy range, 
the signal would be lost in the soft-gamma background. However, that model does 
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not predict a flat spectrum at BATSE wavelengths. Significant redsliift effects might 
move the break of the fireball model spectrum to below BATSE energies, but then 



x-ray detection would be expected. Nevertheless, other models ||3^ suggest that at 
the earliest times after the burst, the spectral break could be at EGRET energies. 



3.5 Conclusions 

The mystery of the 7-ray bursts has recently experienced its third major revelation. 
The first occurred in the late 1960s and early 1970s, when the energetic bursts were 
discovered. The next occurred in 1992, when BATSE made clear the isotropic, non- 
homogeneous distribution of the bursts — pointing to the almost inconceivable conclu- 
sion that the bursts were of cosmological origin. The third was the association, first 
by BeppoSAX , and then by many others of a few bursts with counterpart afterglows 
in many wavelengths, resulting in the measurement of cosmological redshifts. 

One of the outstanding questions about 7-ray bursts has thus been answered. 
Nevertheless, we are far from a complete understanding of the burst mechanism. The 
central engine is probably either a merger of neutron stars or a hypernova. Energetic 
concerns suggest fairly strongly that the initial energy release goes into baryonic 
kinetic energy, and that the observed burst is radiation from a shock front created 
when the outflowing baryons sweep up surrounding material. The existence of shock 
fronts suggests high-energy power law emission, but the exact mechanism is still 
unknown. 

Given the uncertainty of the field, it was judged prudent to examine the EGRET 
data for the presence of 7-ray bursts not detected in other wavelengths. One pos- 
sible detection of a high-energy 7-ray burst event has been found in EGRET data 
independent of a trigger from BATSE. Two previously detected 7-ray bursts were 
independently detected, verifying the search method. The new event, occurring on 
1994 April 27, was detected with a statistical significance of 95.4% on a 3 minute time 
scale although this does not include systematic errors. Pointlike structure and the 
presence of several additional photons which converted in the walls of the EGRET 
instrument lend qualitative support to the detection. 
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Taken together, several less significant events suggest burst activity on the three 
minute time scale with probability 99.0%. Activity is suggested at a somewhat lower 
probability on the ten minute time scale, and at significantly lower probability on 
thirty minute and one hour time scales. 



Chapter 4 

Periodic Time-Series Analysis 



The full-sky map of the 7-ray sky that EGRET made in its first year of opera- 
tion provided unprecedented discovery and localization of 7-ray sources (Table |1.1| ). 
However, EGRET also provides excellent arrival time information about individual 
photons. Combined with good directional information, this timing data can be used 
to look for coherent periodic variation. Six pulsars have been positively identified in 
EGRET data, and the precise rotation of pulsars makes them an obvious candidate 
for periodic analysis. X-ray binary systems also exhibit regular periodic behavior in 
the X-ray bands; it may be hoped that some of these could be detected by EGRET 
as well. 

Since there have been extensive EGRET observations of pulsars, we will begin by 
looking at pulsar models for which EGRET (or GLAST) may have discriminating 
power, as well as the contributions made to pulsar understanding by EGRET so far. 
We will develop the necessary statistical apparatus to search for periodicity in a 7- 
ray signal. We will then see how to apply likelihood statistics to pulsar analysis, and 
the results of that analysis. Finally, we will see how the statistical tools for periodic 
time-series analysis may be applied to X-ray binary searches. 
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4.1 Pulsars 

Two years before the Vela satellites would begin to secretly detect 7-ray bursts, 
Jocelyn Bell and Anthony Hewish detected a periodic radio signal that pulsed every 
1.3 s |j70|. The signal remained coherent for days at a time. However, the mysterious 



source of the signal would be divined before the 7-ray bursts would be declassified. 
The discovery of the Vela pulsar, with a period of 89 ms | |106[ , and the Crab pulsar. 



with a period of 33 ms ||183|| , would narrow the field of prospective sources to one: a 



rapidly rotating neutron star. Such an object is the only astrophysical object capable 
of radiating the power observed at the pulse frequencies observed without flying apart 
under its own rotation. 

The basic description of the pulsar was worked out over the next few years. It 
had long been realized ||2^ that a sufficiently massive body would overcome its elec- 



tron degeneracy pressure by self-gravitation. This situation can come about when a 
massive star {>Mq) exhausts its nuclear fuel. Without the energy released by the 
fusion reactions in the core, the star collapses, the gravitational potential condenses 
the electrons and protons, and a central core of neutrons develops. The remaining 
outer material is blown off in a catastrophic explosion: the supernova. Conservation 
of some fraction of the original stellar angular momentum results in the central neu- 
tron star rotating with a frequency of 0.1-100 Hz. Conservation of the magnetic flux 
in the star (assuming it remains a perfect conductor) results in a magnetic fleld at 
the neutron star surface of order 10^^ G. The observed flux from pulsars, then, is the 
result of a rapidly rotating magnetic dipole (e.g., [^). 

Some basic pulsar parameters may be inferred from elementary mechanics. If we 
make the assumption that all radiated pulsar power comes from the rotational energy, 
then the power is given by the time derivative of the rotational energy ^I^^, where 
/ is the moment of inertia and Q is the rotation frequency. Converting this to the 
pulsar period and period derivative, we have 

E = {2'nflp/p^ (4.1) 



Making some basic assumptions about a pulsar as a rotating dipole, Ostriker & 
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Gunn [153 find that the radiated power can be related to the magnetic field: 



. 2 B'a'a' , 

where B is the surface magnetic field and a is the radius of the pulsar (typically 
10-15 km). Combining these two, we arrive at an estimate of the field strength in 
terms of the period and period derivative 

= li^nr Ipp (4.3) 

Typical values of pulsar parameters are given in Table [4.1| . For any given pulsar, we 
assume that its radius and moment are constant. If we assume also that its magnetic 
field remains constant, we arrive at the differential equation pp = k governing the 
pulsar's age. This equation is integrable, yielding 

^-{p'-pl) = kr (4.4) 

where po was the pulsar period at time zero, and is a constant formed from the 
parameters in equation ( [4.31) . By convention, we assume that the final period is 
much larger than the initial period, and, solving for r above, designate the pulsar's 
characteristic age by 

T = -p/p (4.5) 

The characteristic age seems to be a reasonable approximation to the actual age for 
the only pulsar we can verify. The Crab pulsar was born in a supernova which was 
observed at Earth on July 4, 1054 [|118| , |197|| . The Crab rotates with a period of 
~33.39 ms, and has a period derivative of ~4.21 x 10"^'^, yielding a characteristic 
age of about 1280 yrs. This is correct to within 30%, but the assumption of linear 
spindown is probably least accurate at very short times after the supernova explosion. 
This effect will diminish with pulsar age, while glitches in the pulsar period and its 
derivative will add error to our age estimate. One possible mechanism is gravitational 



wave emission due to mass multipoles |p.53|| . Ostriker & Gunn find that to fully 
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Parameter Typical Values 

Period p 1.6 ms-6 s 

Period derivative p lO^^^-lO^^^ s/s 

Magnetic Field lO^^^^o^^ G 

Characteristic Age 10^-10^ yr 

Moment of Inertia ~10^^ g cm^ 

Radiated Power E ~10^^-10^^ erg s~^ 

Table 4.1: Typical pulsar parameters for 7-ray pulsars. 



account for the age difference, approximately 1/6 of the Crab's current luminosity 
would be in gravitational waves. 

Some elementary physics gives a reasonable first approximation to a pulsar model. 
Nevertheless, physics is a fractal subject, and the number of questions still to be 
answered is seemingly independent of the number of questions already answered. 
Fortunately for EGRET, 7-rays turn out to be an excellent window on pulsar energy- 
generation mechanisms. 



4.1.1 EGRET Contributions 

The EGRET instrument along with the other instruments on board the CGRO has 
revolutionized our understanding of at least one class of pulsars. Extensive reviews 
of those contributions [^, p,46| , |152| , |191| , |194[ have been published elsewhere; we 



will only review their highlights. 

Six pulsars have been unambiguously discovered in high-energy 7-rays. All six 
share some common features. First of all, they tend to have double-peaked light 
curves, with the peaks separated by somewhat less than 180°. 7-Ray pulsars tend 
to have harder spectra than most sources (spectral indices range from —1.4 to —1.8, 
compared to the average 7-ray source spectral index of ~2.3). Five of the six have 
among the highest measured values of E/D^, the radiated power divided by the pulsar 
distance squared. E / D"^ should be proportional to the observed power at Earth. It 
may be calculated by assuming that all of the kinetic energy lost from the pulsar 
spin-down is radiated isotropically, and that the radio dispersion measure gives a 
good distance estimate to the pulsar. 
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The Crab pulsar was detected by both SAS 2 [ [189|| and COS i? @, ^ in 7-rays 



some ten years after its initial radio discovery | |183| |. It was strongly detected in 
many different EGRET viewing periods, yielding high quality light curves and spec- 
tra [p.51| , |202| , ^ |4^ . All 7-ray pulsars are observed to have very steady luminosities; 
in fact, they have been used as constant calibration sources for some variability stud- 
ies ||121|| . Early observers suggested, however, that the ratio of the luminosity in the 
two 7-ray light curve peaks of the Crab pulsar varied with a 14-year cycle | |213| , |155| |. 
However, further observations ruled out significant variation on timescales of order 



14 years gUg . 

When it was first discovered in radio observations ||106|| , no one expected the 
Vela pulsar to be the brightest object in the 7-ray sky. Nevertheless, it was the 
strongest source detected hj SAS 2 | |19(]| |, and remains the brightest steady source 
in the EGRET catalog [^. Although there is significant timing noise (that is, the 
pulsar period fluctuates on short time scales). Vela is easily detected in a number of 
EGRET viewing periods ||94|| . 

The most fascinating 7-ray pulsar is Geminga. The third brightest object in the 
7-ray sky after Vela and the Crab, it was unique when discovered by SAS 2 in 1975 
in that it had no obvious counterpart in other wavelengths |98| , Most notably, 
it had no radio counterpart — a prominent feature of all other known pulsars. It 
would be 17 years before pulsation was discovered in the X-ray emission ^8[. Using 
the X-ray ephemeris, pulsation was quickly detected in EGRET [0, SAS 2 [|114|| , 
and COS B [|^] data. Indeed, the 7-ray signal is so strong that various period 
searching techniques would have independently discovered the pulsation in Geminga 



without the aid of an X-ray ephemeris (§ 4.2.3 , ||116|| ). Recently, there has been an 
unconfirmed report of pulsation detected in radio [ |102| ]; whether or not this detection 
can be confirmed, the radio emission is clearly very weak. 

The remaining three 7-ray pulsars are less well-known, but no less important 
to our understanding (or lack thereof) of 7-ray pulsar physics. PSR 1706-44 was 
only discovered as a radio pulsar in 1992 |]79[ and soon thereafter in the EGRET 
data [|192|]. Unlike the other five 7-ray pulsars, which have two peaks separated 



by 140°-180° in phase, PSR 1706-44 exhibits a single peak in its 7-ray light curve. 
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Unexpected in its own way is PSR 1055-52. It was discovered as a 7-ray pulsar in 
EGRET data in 1993 [^|, although it ranks approximately 29th in known E/D"^ 



Why PSR 1055-52 is observed when other, apparently more luminous pulsars are not 
remains a mystery. As a final confusion, PSR 31951+32 — fourth in E/D"^ — was not 
observed as a pulsed source in EGRET Phase I. Further pointed observations in 



Phase III resulted in a weak pulsed signal ||167 | 



EGRET observations of 7-ray pulsars pose a unique challenge for theorists. The 
small number of pulsars and the wide variety of observed behavior makes it difficult 
for any one model to match all the observations while remaining simple enough to 
make concrete predictions. Nevertheless, two main classes of models dominate the 
thinking of most pulsar theorists today. 

4.1.2 Models 

The theory and ramifications of 7-ray pulsar models are extensively discussed else- 
where ||112| , |4^ , |191|| ; we will review the highlights of the two major models. Most 



current 7-ray pulsar models may be categorized as either "polar-cap" or "outer-gap" 
models. These two classes differ mainly in the region where the energy is emitted. 

Polar-cap models 



The polar-cap models build on the early pulsar models of Goldreich & Julian | J54| | , 
which defined the standard pulsar framework. Although many researchers have ex- 
tended their work, Daugherty and Harding are largely responsible for developing the 
model for 7-ray emission pT] , |32[] . The "polar cap" is a region at the magnetic pole 
of the pulsar, which in general is not aligned with the rotation axis. Some magnetic 
field lines emerge from one pole and re-enter the pulsar at the other pole. However, 
since the pulsar is rapidly rotating, there is some radius (given by r = Q/c) at which 
the field lines can no longer corotate with the pulsar without exceeding the speed of 
light. This radius defines the light cylinder. The closer the point from which the field 
line emanates is to the magnetic pole, the closer the field lines will approach the light 
cylinder. Field lines which emanate from within some critical radius will not return 
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to the pulsar; instead, they will close at infinity. The polar cap is that region around 
the magnetic pole inside the critical radius. 

The polar cap is important because free charges placed on open field lines will 
escape to infinity. Potential differences across the polar cap can rip charge off of the 
pulsar surface. This charge then flows along the curved fleld lines, emitting curvature 
radiation. At the lowest altitudes above the pulsar surface, this curvature radiation 
is so energetic that it pair-produces in the magnetic fleld. Eventually, the charged 
particles have lost enough energy that the photons (7-rays) no longer pair produce. 
These 7-rays are observed by EGRET. The details of the fleld shapes and the emission 
regions determine the 7-ray and radio light curves. Currently, it seems as if many 
7-ray pulsar characteristics can be explained with the polar cap model. However, it 
has been suggested that general-relativistic frame- dragging effects may more strongly 
influence the 7-ray emission than the precise shape of the magnetic fleld ||144| , |143| . 



Outer-gap models 

A substantially different model of energy emission is the "outer-gap" model, flrst 



proposed by Cheng, Ho, & Ruderman pBI, p^. In these models, vacuum gaps may 



arise along the last closed fleld lines in the pulsar magnetosphere. The gaps separate 
charge of different sign on opposite sides of the last closed fleld line, causing signifl- 
cant electric potentials. Charged particles may be accelerated across these potentials, 
following the magnetic fleld and emitting curvature radiation. The details of the 
shapes of the emitting regions and the resultant light curves are not immediately ob- 
vious, but extensive work has been done to calculate the impact of the magnetosphere 
geometry on the 7-ray observations |171, The outer-gap models also seem to 



describe many of the qualitative features observed in 7-ray pulsars. 
Remaining Questions 

It is still not clear which model of pulsar emission more accurately describes 7-ray 
pulsars. In general, 7-rays emitted from outer-gap regions will be emitted in a more 
fan-like pattern, while 7-rays emitted from polar-caps will be more tightly beams. 
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However, both models can reproduce a subset of the pulsar characteristics observed 
in EGRET data |TU^, 

Unfortunately, the sample of observed 7-ray pulsars is still small, and it is difficult 
to determine which observed characteristics are generally representative and which 
are particular to the observed pulsars. It is therefore critical to increase the number 
of observed 7-ray pulsars. The GLAST mission will bring a superior 7-ray telescope 
to bear on the problem (Chapter However, it is possible that there are additional 
Geminga-like radio-quiet pulsars which have already been observed by EGRET. Our 
task is to find likely pulsar candidates, and determine their pulsation parameters. 

4.2 Statistical Methods 

Searching for pulsation in EGRET sources poses a unique set of problems to the 
pulsar researcher. Dispersion, the scourge of radio pulsar searches, is non-existent 
in 7-rays. However, since the fiux is quantized into 7-rays, EGRET might detect a 
photon from a pulsar only once per thousand rotational periods. Long integrations of 
coherent pulsar emission are required to resolve the periodic behavior. Furthermore, 
the particular structure of EGRET data, organized by individual photon information, 
is significantly different than the flux information found in other wavelengths. Low 
count rates mean that the instrument state can vary signiflcantly between the arrival 
of individual photons. The low rates also require that as much information as possible 
be extracted from each photon. 

For these reasons, special statistical care must be taken in analyzing EGRET data 
for pulsar signals. Extensive coherent pulsation analysis has been already been done, 
and it will be instructive to examine the challenges speciflc to EGRET previously 
overcome (§ [4.2.1| ). First, however, we must approach some difficulties basic to all 
pulsation analysis. 

In order to be sensitive to pulsed emission from the fastest pulsing sources known 
(a few milliseconds), we must have timing accuracy to a fraction of the pulse period. 
EGRET records photon arrival times to an absolute accuracy of 100 fis, two and a 
half orders of magnitude smaller than the period of the Crab. However, the radius 
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of the Earth's orbit is 8 hght-minutes. Photons from source observations separated 
by six months time will have light travel times differing by hundreds of thousands of 
pulse periods. In order to compensate for the motion of the Earth, we will translate 
all measured arrival times into Solar System Barycenter (SSB) times. The SSB time 
is the time when a photon would have arrived at the solar system barycenter if none 
of the mass of the solar system were present. The conversion between measured time 
(tuTc) and barycenter time th has been worked out in detail elsewhere |^3|, p,88| ]. The 



expression will be motivated and examined in Appendix ^ the schematic result is: 

tb = ^UTC + ^convention + ^location + ^Einstein + ^Shapiro (4-6) 

where adjustments to Universal Time are made to correct for bookkeeping conventions 
such as leap seconds, for the current location of the spacecraft with respect to the 
barycenter, for the gravitational redshift ( "Einstein delay" ) effects, and for the delay 
due to the gravitational potential ("Shapiro delay"). The full result is given by 



equation (|A.5|) . 



4.2.1 Previous Methods 

Two methods have been used to examine coherent periodicity in EGRET data. The 
first one, used for analysis of pulsars with known ephemerides, was implemented by 



Joe Fierro in a program called PULSAR [42|. PULSAR selects photons within an energy- 



dependent cone about the source position. The acceptance radius, chosen to maximize 



the signal-to- noise ratio, is given by equation ( |1 . 1| ) . The pulsar period, along with 
the first and second period derivatives, are used to assign to each photon a phase, 
corresponding to the fraction of a rotation through which the pulsar has turned since 
a reference time to- Recall that a 7-ray will arrive, on average, only once per hundred 
or thousand rotations of the pulsar; by retaining only the fractional rotation, the 
emission from many rotations is added together coherently. This process is known 
as epoch folding. Plotting the resulting photon rate as a function of phase yields a 
light curve — the average profile of emission through a rotation. Photons can then be 
selected as a function of phase for further analysis. 
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The phase of a photon is determined from the pulsation period and its derivatives, 
as measured from some reference epoch: 

^{t) = 0(to) + i^{t - to) + l/2u{t - tof + l/6i){t - tof + ... (4.7) 

where u = 1/p is the pulsation frequency and p the period. In practice, it is only 
rarely that the second-derivative term of the Taylor expansion is important; higher 
orders are negligible. The constant term in the expansion is kept only to allow 7-ray 
light curves to be compared with radio light curves. 

Of primary importance in such an analysis is the determination of the presence 
of intensity modulation at the known frequency. Ephemerides for radio pulsars have 
been collected into a database stored at Princeton University [|187|| . 7- Ray photon 



phases based on these ephemerides may be tested for periodicity. 
The test 

The simplest test is the test. Photons are sorted into m bins, based on their 
phase. Since photon count statistics are distributed as a Poisson, the variance in the 
number of photons in each bin is Nm, the number of photons in the bin. The usual 
statistic is calculated, and the probability of the observation under the null model is 
computed. A sufficiently low probability of observation under the null model is taken 
as evidence for variability. While this method is simple, it has a number of undesirable 
aspects. First, the significance of the detection depends strongly on accidents of bin 
boundaries — a problem inherent to binning schemes. Second, the is invariant under 
permutations of bins. We expect pulsar light curves to smoothly vary, with one or 
two main peaks, and would like our statistical method to have some power to identify 
such light curves. 

The test 

The test was developed to try to address some of these difficulties ^ 
The basis of this test is the realization that the light curve is well represented by the 
phase density /(0), which is the fraction of the integrated intensity in each dep. In 
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its simplest form, corresponding to m = 1 and known as the Rayleigh statistic [ 113 
this phase density will be compared to a sinusoid of unknown amplitude and phase. 
The general form uses more terms of the Fourier expansion. The best estimate of the 
phase density function is a series of 5-f unctions: 

1 ^ 

/(<^) = ]vE^(</>^) (4-8) 
1=1 

We will define /m(0) as the function composed of the first m Fourier components of 
f{(p). To calculate the statistic, we determine a comparison (or model) phase 
density function fm{4>)- We then find the squared difference from fm{4>) and fm{(p) 



Zt = 2tiN 



2tt 







2 



fm{<P)-fm{<P) d<P (4.9) 



If we take the comparison density function to be constant, this is simply computed; 
it can be shown to be equal to 2N YJk=i{,Oi\ + /?|), where and [3k are the even and 
odd Fourier coefficients from fm{<P)- -^m asymptotically distributed as with 2m 
degrees of freedom 

This approach clearly mitigates some of the problems with the test. Primarily, 
photon binning has been eliminated. In addition, the Fourier decomposition expects 
a periodic signal — the degeneracy to permutation of fiux peak location is removed. 
Unfortunately, the Z^ statistic has shortcomings of its own. The significance of a 
detection is a strong function of the number of harmonics chosen. It is not difficult to 
see why this should be the case. Retaining only m harmonics of a sum of 5-functions 
amounts to smoothing the observed phase density function. Too much smoothing (a 
very small m) washes out any signal that may be present. Too little smoothing (a 
very large m) results in the signal being swamped by noise from the higher harmonics. 
To ameliorate this difficulty, another statistic was invented. 



The H-test 

In the quest to obtain more and more significant results, it naturally occurred to 
astrophysicists to try a variety of values of m, and see which one results in the most 



CHAPTER 4. PERIODIC TIME-SERIES ANALYSIS 



73 



significant value of Z^. Fortunately, de Jager et al. realized that the set of 
obtained in this way would no longer be distributed as xim- ^^is point, there is 
no further guiding insight; an ad hoc limit of 20 is set on m, and the distribution of 
H = maxi<m<20 Z^—4m+4 is found from a Monte Carlo simulation. The inclusion of 
the term involving m favors models with fewer Fourier components, and the constant 
ensures that the statistic will be positive. This test has become the primary tool to 
search for a pulsed signal in EGRET data [P^]. 



Limitations of these statistics 

Statistics for analyzing pulsar data have evolved, with each generation of statistics 
alleviating some of the problems of the previous generation, but creating problems of 
its own. The discussion of likelihood statistics in §2.3| sheds some light on why these 
difficulties arise. A full discussion of the application of likelihood statistics to this 



problem will be given in § |4.2.2| ; here we will quickly examine the above statistical 
measures. 

Since the H-test is the most sophisticated measure, we will focus on it. If we 
permit ourselves to use some Bayesian language, we see that the goal of the H-test 
is to eliminate the nuisance parameter m. The method is really to take H a.s a 
function of m, and maximize this H{m) with respect to m. Each Z^ has the form 
of the squared difference between the null model and a representation of the data 
comprised of m harmonicsQ. This is just the logarithm of the likelihood of the "data 
representation," assuming a null model and Gaussian errors of variance N^- The 
constant term only insures positivity; clearly it does not affect the distribution of the 
statistic. The only remaining term contains an m. To see its significance, let us write 
the exponentiation of H{m) — that is, instead of the logarithm of the likelihood, the 



*Note that each representation, labeled by m, is derived from the data, which is not permitted 
in Bayesian analysis. Therefore, the likelihood formed is not "the likelihood of the data, given the 
model," but "the likelihood of the representation, given the model." Since the representation is a 
(lossy) smoothing of the data, we have no reason to expect a one-to-one relationship between the 
likelihoods. Nevertheless, the qualitative argument is still valid — this only means that multiplying 
the by a prior and integrating over m would still give incorrect numerical results. 
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likelihood itself. 

expi7(m)=exp(Z^-4m + 4) = (eV^™)exp J [U{<P) - fm{<P)f d<j) (4.10) 

The integral is the likelihood; it is the exponential of a sum of squared differences, 
which of course is a Gaussian. The first term, e"^™', is readily identified as the 
prior! The Bayesian would now integrate[| H{m) over m to eliminate the unnecessary 
parameter m as described is § |2.6.3| . However, if H{m) is at all peaked as a function 
of m, then e^*^"*^ is even more peaked, and the integral of e^*^™^ over m is very nearly 
equal to the maximum. The if-test, then, is an approximation to the integral over 
m of the likelihood times a prior of e"^™. 

This example makes the comparison of the frequentist and Bayesian schools quite 
explicit. The Bayesian makes arbitrary, subjective choices of priors. The frequentist 
makes arbitrary, subjective choices in the method of smoothing — for example, it is 
not clear that the basis of sines and cosines is the optimal smoothing for the H-iest. 
The advantage to the Bayesian method is that it makes explicit when the subjective 
choices are to be made, and completely prescribes the rest of the analysis process. 

As we have seen, the further disadvantage is that the scientist creating new statis- 
tics cannot always consider all the ramifications of his choices. Unfortunately, the H 
statistic cannot be easily adapted to a Bayesian analysis; for while Z"^ may appear 
to be a true likelihood, it is not. The likelihood must represent the probability of the 
data. Z"^ represents the probability of fm{(p), which in general, does not have the 
same probability density as the actual data. Furthermore, it is not at all clear that 
g-4m most suitable prior. If de Jager et al. had claimed to use such a prior, 

they would have had to spend some time justifying their choice. However, under the 
guise of creating an ad hoc statistic, they have surreptitiously inserted that prior into 
the analysis — probably fooling even themselves. 

Is the H-test then worthless, or worse yet, misleading? No. Within the constraints 
of the simulations done to quantify its significance, and within the assumptions that 
it implicitly makes, it is completely valid. However, the likelihood approach offers a 

^if only the were a well-formed likelihood. 
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method which is both simpler to understand, and at least as powerful. The correct 
generalization of the H-test will be discussed in §[4.2.6. 



Fourier Transforms 

The statistical tests described above have been used extensively in analyzing EGRET 
data, but have only had success when a pulsar ephemeris from radio or X-ray ob- 
servations was used to find the photon phases (pi. The Geminga pulsar is the first 
known "radio-quiet" pulsar, with little or no radio emission despite a strong pulsed 
signal in X-ray and 7-ray observations |11, 12CI|| . It is natural to search for other such 



radio-quiet pulsars in the EGRET data. A likelihood method of pulsar searching will 
be discussed in § [4.3| ; here we will briefly touch on a Fourier transform method. 

The traditional method of searching for periodic behavior at an unknown fre- 
quency is the fast Fourier transform (FFT). The main advantage of the FFT is its 
speed. Indeed, such an algorithm has been implemented to search for periodic sig- 



nals from pulsar candidates [|116|] . It has been shown to be successful in detecting 
Geminga — although as we will later see, the signal from Geminga is so strong that 
this is not a definitive test. The drawbacks to the FFT are also significant. First 
of all, the FFT cannot detect a period derivative. Trial period derivatives must be 
removed from the photon phases, and a new FFT performed for each period deriva- 
tive. The FFT also cannot naturally deal with noncontiguous variable exposure and 
backgrounds. In order to detect the weakest pulsar signals, we will need to identify, 
to the extent possible, the photons most likely to be source photons, and account for 
the instrument exposure on a photon-by-photon basis. 

4.2.2 Maximum Likelihood 

The ideal statistical method for pulsed-signal analysis would be sensitive to variations 
in pulsed flux from the candidate source, rather than pulsed count rates in a region 
around the source. Given the success of the maximum likelihood method for spatial 
analysis (§ |2.4D , it seems natural to apply likelihood to the problem of periodic signal 
analysis. Again, we will find that the mathematics may be interpreted in a maximum 
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likelihood framework or a Bayesian framework. Following a suggestion by Gregory 
& Loredo we will generalize the binned method to a maximum likelihood 



method. In § [4.2.6| we will examine the possibility of generalizing an unbinned method. 

The first thing we must do is to construct the likelihood — the probability of ob- 
serving our data, given our model and the state of the instrument. Since the state 
of the instrument and the background are different for every photon, we first divide 
our time interval and spatial intervals into sufficiently small subintervals that there 
is either zero or one photon in each subinterval. We then have an array in parameter 
space of volume elements At Ai Ab. The probability that a given element has no 
photons in it will be the Poisson probability of detecting zero photons during a time 
At from a direction element Ai Ab with rate r{t,i,b), which depends in general on 
time and observed position. The probability of detecting one photon follows similarly: 

Pi{t,i,b) = r(t,£,6) AtAM6e-"(*'^''')^*^'^^'' (4.11) 

We identify each element in t, i, and b with a sequential, unique label. Furthermore, 
let us suppose that of these parameter-space volume elements contain one photon, 
and Q of them contain no photons. N+Q is then the total number of parameter-space 
volume elements, and is the total number of photons. We may find the likelihood 
of this configuration by multiplying the probabilities of each element: 

>C = n PiiUAM n Po{tjJj,b,) (4.12) 

where is the set of Q volume elements with zero photons, and ai is the set of 
volume elements with one photon. Plugging in our expressions from equation (|4.11| ), 
we may factor out the exponential and arrive at 



C = (AtAiAb)^ n r{ti,e,,bi)exp 



N+Q 

- E r{tj,ij,bj)AtAiAb 



(4.13) 



The sum in the exponential is over all volume elements. It is simpler to separate this 
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sum into three independent summations, one for each parameter. In the hmit that 
At, Ai, and Ab are very small, these sums become integrals. Now, the i are just 
arbitrary labels. We renumber the volume elements so that elements through N 
contain one photon, and we have 



TV" 

C = {AtAiAbfp^r{ti,ii,bi)exp - jj^ Jj{t,i,b) dtdidb 



(4.14) 



Note that nowhere have we assumed that the At are contiguous. Thus, this likelihood 
expression is valid even if there are gaps in the data. 

The rate r(ti, ii, bi) is a function of both the source strength and background. We 
assume that the background is constant in time, and varies in space according to the 



standard EGRET gas map [|10|, |75[. The positional dependence of the source rate will 
be taken from the instrument point-spread function. We will thus take the rate to be 



(4.15) 



where / is the time- dependent model source flux as before, A{ti,ii,bi) represents the 
instrument response function, and B{ti,ii,bi) is the background flux. We will worry 
about the detailed forms of A and B below; however, we note here that the time 
dependence of A and B is a. result of the time dependence of the instrument sensitive 
area. 

To simphfy the products and exponents, we take the logarithm of the likelihood: 



N 



ln£ = Nln{AtAiAb) +J2HAfiti) + B^ 



i=l 



{Aj{ti) + Bi)didbdt (4.16) 



where Ai and Bi are shorthand for A{ti,ii,bi) and B{ti,ii,bi). 

Now, eventually we will either maximize the likelihood, or find a ratio of likeli- 
hoods. Thus, constant additive terms in In C may be dropped. We drop the first term 
and separate the integral to get 
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N 

ln£ = Y.HAf{ti)+Bi)~ 

i=l 

[ [ [ A{t,e,b)f{t)didbdt- [ [ [ B{t,i,b)dtdidb (4.17) 

Ji Jb Jt Jt J I Jb 

Although the last term is (implicitly) a function of source position, it will be constant 
for our purposes, as we take source position to be given. Since it is constant, we drop 
it. In the second term, we may split A into PSD(£, 6)SA(t), the point-spread density 
and the sensitive area. The point-spread function is normalized, by construction, for 
all times, and SA(t) depends only on the source position, not the photon positions. 
Therefore we may split the integral into 

-/ f A'{£,b)d£db [ f{t)SA{t)dt (4.18) 
Je Jb Jt 

The spatial integrals are normalized, and the resulting likelihood function is given by 

\nC = Y^HAfiU) + B,)- f fit)SAit)dt (4.19) 
1=1 -'^ 

Remember that / is the model source flux. It is this model that we will vary to 
find the model most likely to have produced our data. The data itself appears only 
implicitly in equation ( [4. 191) , in the Ai and Bi, and most importantly in the ti at 
which the model flux is sampled. 

4.2.3 Application to EGRET 

We have not yet selected a functional form for /. It is tempting to follow the lead 
of the statistic, and let / be a function composed of some number of harmonics, 
perhaps with parameters to be determined. We will delay such an attempt to § [1.2.6| , 
and first develop a method based on which we will find to be less computationally 
demanding. 

The functional form of / that corresponds to binning is stepwise constant, with 
m different steps. The immediate benefit of such a choice comes in the simplification 
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of the second term of equation ([4.19|) : 



//(t)SA(t)rft— >£/,r, (4.20) 
where tj is the sensitive area integrated over the hvetime in bin j; that is, the total 



exposure in bin j. Now equation ( [4.19| ) may be stated as a sum over bins 

(4.21) 



ln£ = E 



iSbin j 



Thus the maximum of ln£ may be found by independently maximizing the fj for 
each bin, instead of simultaneously maximizing all the parameters of / — an enormous 
computational savings. 

The analysis is thereafter quite simple, and quite slow. A candidate pulsar is 
selected for analysis. All photons within some radius, usually 15°, are collected. Each 
photon with sufficient energy (>100 MeV) which arrives from a position sufficiently 
far from the Earth's limb (<95° from the instrument axis) during a valid instrument 
mode is accepted for analysis. For each of these photons, the arrival time is corrected 
to the Solar System barycenter, and the phase-independent likelihood factors Ai and 
Bi are calculated. For periods less than a minute, it may reasonably be assumed that 
the instrument exposure will be evenly distributed across the bins. For longer periods, 
such as X-ray binary orbital periods, all instrument exposure during the observation 
is calculated as a function of SSB time. 

To ameliorate the primary objection to binning, namely, that the arbitrary place- 
ment of the bin boundaries may obscure a signal, we may allow an offset to the 
bin boundary. In practice, this means rebinning the data for a number of different 
boundary offsets. Of course, such an operation increases the number of trials. 



Sampling Density 

A subset of parameter space is identified to be searched for pulsed signals. Some 
boundaries of the range of period and period derivative are set by the physics of the 
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system being examined; those will be motivated in §0 and §4^. The sampling size 
required is determined by the statistics. To determine the number of steps required 
to sample the period space, we consider the difference in phase of the last photon in 
the observation as a function of the change in trial periods. If two trial periods differ 
by so little that the change in phase of the last photon as computed by equation ( |4.7| ) 
is less than the width of a bin, then the most likely /^'s computed by maximizing 
equation ( [4. 211 ) will be unchanged. So, we choose the minimum spacing so that 



the phase of the last photon will change by an amount of order the bin width, and 
calculate the number of steps to take in period across the desired range. 

Np = {Vmax - I'min) m St (4.22) 

where St is the length of the observation. Note that a longer observation means more 
photons will be observed, increasing the signal-to-noise ratio, but the period space 
must be searched more densely. 

Similarly, we can find the minimum period derivative we expect could make a 
difference. In this case, our condition is that (j){St) — uSt is an appreciable fraction 
of a bin. Neglecting higher derivatives, this happens when ^uSt ^ p/m. Since the 
frequency derivative z> = p/p^, we have 

Pmin = ..,.2 (4-23) 

The maximum period derivative must be determined by the physics. 

With the range of period derivatives for a given period in hand, we may compute 
the number of samplings in period derivative required. Again, we find the step size 
by equating the difference in computed phase with the bin size: 

\i>{Stf - ^ (z> + Si>) {Stf = l/m (4.24) 

and thus the number of steps required in the period derivative: 

= ^{i^max - i^min) m (St)'^ (4.25) 



CHAPTER 4. PERIODIC TIME-SERIES ANALYSIS 



81 



Significance Estimates 

For each point in parameter space, the flux in each bin is estimated by maximizing 
the likelihood. Since the sampling of parameter space was chosen to make each point 
roughly independent, the number of points sampled in parameter space should be 
a good estimate of the number of independent trials. The signiflcance is then the 
probability of measuring no likelihoods greater than or equal to that observed in the 
number of trials. 



Automated analysis with timevar 

These methods have been automated into a tool for searching for periodicity in 
EGRET data called timevar Capable of running either interactively or in 

batch mode, timevar searches a given region of parameter space, for a given number 
of bins. It maximizes the likelihood for each period and period derivative, and reports 
the signiflcance of the most likely period and period derivative. In addition, it yields a 
light curve, both as the most likely flux in each bin, and for comparison with PULSAR, 
the number of photons within the 68% containment radius in each bin. 

To demonstrate the capabilities of timevar, we examine the Geminga pulsar. We 
will check viewing period 413.0, corresponding to observations between 7 March 1995 
16:44:00 and 21 March 95 14:08:39. A small range around the known period was 



searched, and the results are presented in Figure [4.1| . The light curves are shown in 
Figure |4.2| . The probability of the observation in the null model is less than 10~^^. 

The great advantage of timevar over previous analysis methods is that it combines 
period and period derivative searching ability with a proper likelihood treatment of 
the source and background as perceived through the instrument point-spread function 
and sensitive area. PULSAR has been useful to examine known pulsars with radio 
ephemerides, although it does not optimally use the point-spread and background 
information about each photon. The disadvantage to timevar is its computational 
expense. Examining a single pulsar candidate over a reasonable range of period and 
period derivatives currently takes on the order of months of continuous running on a 



Sun Microsystems Sparc 10. Mattox et al. |116| have sidestepped this problem flrst by 
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Figure 4.1: The likelihood of periodic modulation from Geming function of 

period, for a 30-bin light curve. Plotted on the y-axis is 2 ln£, which is distributed as 
X29, since the average intensity in the null model is a free parameter. The probability 
of the peak in the null model is less than 10~^^. The maximum likelihood value of 
the period is 0.237097785 s, assuming a period derivative of 1.09744 x 10~^^ s/s and 
a truncated Julian day epoch 8750.0. 

turning to a Fourier algorithm — which is much faster, but cannot properly account for 
backgrounds and point-spread functions — and second by using a massively parallel 
computer. Further optimization may be possible to speed timevar, and certainly 
additional computing power will reduce the required search time. A similar program 
designed for GLAST will be slowed by the additional order of magnitude in the 
number of photons to process, but since the sensitive area will be larger, the elapsed 
time required to observe a sufficient number of photons will be shorter, allowing period 
space to be sampled less densely. In addition, faster computers will be available to 
do the calculations. The net effect will be that period searching with the GLAST 
equivalent to timevar will be quite feasible (Table [4.4|) . 
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Figure 4.2: Light curves of the most hkely period as found in Figure [4 .11 . The hght 
curve on the left is the most hkely flux in each bin, after accounting for the instrument 
point-spread function and sensitive area and the background. The light curve on 
the right is the raw photon count inside the 68% containment radius, as given by 
equation (|1.1|). This is the same as the light curve used by PULSAR. 



4.2.4 Bayesian Inference 

A further advantage of the likelihood method employed by timevar is the ease by 
which results can be evaluated under a Bayesian framework. Strictly speaking, the 



plot shown in Figure ETI is meaningless in a maximum likelihood framework — only 
the maximum value, and not the entire likelihood function, is relevant. However, 
intuitively we expect the rest of the function to have some meaning. Local maxima 
should represent frequencies more likely to be present in the data. We would expect 
harmonics of the main pulse frequency to be present. As we saw in § ^.6| , the Bayesian 
formulation includes all this information into the posterior probability distribution. 
We simply multiply the likelihood function given in Figure [4.1| by the prior proba- 
bility distribution. Then we may integrate the posterior distribution over interesting 
intervals, or simply present it as it is. 



Appropriate Priors 

The least-informative prior probability distribution for the pulsation period is scale- 
invariant. That is, it should not depend on the units used for measuring period, 
and it should be the same whether the search is done in period or frequency space. 
The appropriate solution is fiat in the logarithm of the period. Therefore, we have 
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-Pprior(p) oc 1/p. The upper and lower bounds of the period search define the cutoff, 
so that the prior can be normahzed: 

Ppno.(p)= fln^) - (4.26) 

The least informative prior for the period derivative is uniform. The period deriva- 
tive is a dimensionless number, and thus is the most natural unit in which to work. 
The last parameter of interest is the phase offset of the bin boundaries. This is also 
dimensionless, and thus the least informative prior is flat. 

Gregory & Loredo |^ point out that marginalization over m yields the best light 



curve independent of the number of bins. This is exactly the spirit of the H-test. 
However, this is computationally impossible for a period search given the present state 
of the art. It is interesting to note here that the proper treatment of this problem is 
very difficult using frequentist statistics. Two models with different m have a different 
number of free parameters. The Bayesian framework takes care of this naturally; each 
free parameter (that is, each of the m fluxes) has its own prior; thus the complete 
prior for all the fluxes (assuming a flat prior between zero and the large cutoff) will 
be Pprior(/m) = fmTx- This prior is a quantization of Ockham's Razor: models with 
more free parameters (that is, larger m) are discouraged. The frequentist method 
compares each model to the null model, but cannot directly compare the two. 

Figure |4.1j is actually one of three different likelihood functions measured for dif- 
ferent phase offsets. It is the slice of the likelihood corresponding to the most likely 
offset. The perpendicular slice through the likelihood as a function of offset flxed at 



the best period is shown in Figure 4.3. The phase offset is a classic nuisance parame- 



ter; it arose by the arbitrary choice of to iii equation ( |4.71) . We therefore marginalize 



over the offset and arrive at the likelihood function shown in Figure O. 

To flnd the overall signiflcance of a detection]^ in the Bayesian framework, we 
multiply the likelihood function shown in Figure ^]4| by the period prior given in 



equation ( [4.26| ). The ratio of this probability with the probability of the null model 

"''In our case, this will be the overall significance of a detection for all models with m bins. In 
principle, however, the parameter m could be marginalized over as well, yielding the true overall 
detection significance. 
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Figure 4.3: The likelihood of pulsation in Geminga as a function of the phase offset 
of the bin boundaries, for fixed period. The period chosen is that of the maximum 
likelihood for any offset, and corresponds to the peak seen in Figure [4.1| . The width 
of the plot corresponds to one bin; therefore, the leftmost point and the rightmost 
point are the same. 



(m=l) gives the odds ratio. In this case, the probability of periodic modulation is 
completely dominated by the peak, and the significance of the detection is 1—7 x 10^"^^. 

The Geminga pulsar is clearly detected by either method. The detections are 
so strong that even if a very large region of phase space had been searched, the 
detections would still be significant. Finding the significance that would have resulted 
from such a search may be calculated by assuming that the peak we found would be 
the maximum likelihood over the entire search, and that the number of trials would 
be the number of steps found in § 4.2.3| . A thorough search of viewing period 413.0 
from 50 to 500 fis, over the appropriate period derivative ranges would require 



approximately 7 x 10 trials. The significance of the detection in Figure ETI would 
still be 1 — 10~^°. However, it is clear that weaker detections will not survive such a 
harsh attack. It is therefore also useful to develop some formalism for upper limits 
and detection thresholds. 



4.2.5 Upper Limits and Thresholds 

The problem of upper limits was briefly touched upon in § [^.7| . There we noted that a 
desirable quality of an upper limit is that it be statistically well-behaved; that is, that 
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Figure 4.4: The likelihood of pulsation in Geminga. This likelihood function is 



similar to that shown in Figure [4.1| , except that the likelihood has been marginalized 
over the phase offset. 



a "1(7 upper limit" be greater than the true value in 68% of the ensemble of possible 
data sets. The value of this upper limit depends on the details of the data. 

In contrast, in some cases it may be more useful to calculate a threshold. We 
define a threshold to be the intensity (of a source, or of pulsation) that would be 
detected by our criteria in some fraction of the possible data that might be measured 
from that source. This may be quantified: an x% detection threshold is that true 
parameter value for which x% of possible data sets would result in detection. We make 
a statement about the outcome of a hypothesis test (is there pulsation?) instead of a 
statement about a point-estimate (is the pulsed fraction less than some value?). 

There are three advantages to this. First of all, it more closely reflects the intu- 
itive "upper limit" concept — that is, it is now reasonable to say, "The threshold for 
detection is /. Since we do not detect, then the actual value is probably less than /." 
We will quantify this below. Second, we may calculate thresholds from the underlying 
distributions of the data; upper limits can only be calculated with the specific data 
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observed. This means that thresholds can be precalculated. This is a great advan- 
tage, especially for instruments which are still being designed or built. Finally, since 
we can use distributions, we can calculated thresholds (at least partially) analytically. 

For the purposes of pulsation analysis, we will define the "pulsed fraction thresh- 
old" as the fraction of the flux (background plus source) that must pulse such that 
a fraction C, of data sets observed with true sinusoidal variation with threshold am- 
plitude will result in detection. We will assume that any data set with significance 
greater than a will be a detection and the variation will be sinusoidal. 

For simplicity, we define C = 2 ln£. We will take ^ = 0.5, so that we may find the 
a-point of the integral distribution. That is, we seek C such that /q*"^ X^('^, t) dx = 
a, where d is the number of degrees of freedom, and t is the number of trials.| This 
follows from Wilks' theorem (§ |2.5| , ||211[| ), which states that C is distributed under 



the null model as with the number of degrees of freedom equal to the number of 
free parameters in the model. To find the mean value of C, we integrate over all data 
sets, weighted by their true probability: 

(C) = I CT{D)C{D)dD 

JD 

= I CT{D)2[\nCBF{D) -\nCBN{D)]dD (4.27) 

JD 

where Ct{D) is the true likelihood of data D, Cbf{D) is the likelihood of the best 
model fit to data D, and Cbn{D) is the likelihood of the best null model fit to data 
D. 

To proceed, we turn once again to Wilks' theorem. If the null model is true, then 
from equation ( |2.28|) we have 2{\n Cbf — ln>C„„/;) ~ Xd- From the definition of the 
expectation value, we know that 

(A(D))= I P(D)A(D)dD (4.28) 

JD 

for any function A{D). The expectation value of Xd is d and the probability of 

^Assuming that the mean of C is a good approximation to the median, so that roughly half of 
the realized values of C are greater than (C), and half are less. 
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the data is the hkehhood of the data under the (unknown) true model Ct{D), so 

(2 [\nCBF{D) - \nCT{D)]) = d (4.29) 
Therefore, we conclude that 

/ Ct{D)2 \\nCBF{D) - \nCT{D)] dD = d (4.30) 

JD 



Substituting into equation ([4.27|) , we have 



{C) = 2[ Ct( D) In Ct( D)dD + d-2 [ Ct( D) In Cbn (D) dD (4.31) 

JD JD 

We now make some further assumptions about the pulsation. As before, we will 
let fm{4>) to be the model flux as a function of with m parameters. In particular, 
we will define the null model fi{4>) = fi, with one parameter, the mean flux. We will 
further assume that the number of photons in each bin is large enough to approximate 
the Poisson distribution as a Gaussian (that is, Ui is large). Then in each bin i we 
may write 

Zi = 7 ^"^^'^^^ (4.32) 



where is the number of photons in bin i, and fm{<Pi) is the model number of photons 
in that bin. Then Ct{D) = Hi '^^/^ ■ For each bin, the first term of equation ( [4.31| ) 
becomes 

/ Ct( D) In CT(D)dD = ^e""?/' | - In ) c/^i 

JD, J-oo V27r V 2 / 

= -lnV2^-l/2 (4.33) 



This does not depend on the model fm since for each bin, the best fit model should 
describe the data in approximately the same way. That is to say, the statistical 
deviation from the best fit model should not depend on the details of that model. 
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Using this result, equation ( [4 .311) becomes 



(C) = -2m In -m + d- f CAD^lnCBNiD) dD 

JD 



(4.34) 



The last term contains the best fit null model. This must be related to the true null 
model in the same manner as equation (|4.29|) . 



J^CT{D)\nCBN{D)dD = J^CT{D)[\nCTN{D) + 1/2] dD 

= 1/2+ / CT(D)\nCTN(D)dD (4.35) 

JD 



Again assuming Gaussian fluctuations, Ctn{D) = Hi ^=exp 



h 



yielding 



-2m In — m + d — m + 2m In \/2ti 



{JDi y/2'K 



(4.36) 



The number of degrees of freedom is m — 1. To facilitate the integral, we write rij 
in terms of Zi. The integral over the data Di is properly normalized as an integral 
over Zi. We assume that rii is large enough that expanding the bounds of integration 
from — oo to oo will introduce only a small error. 



{C) 



-m 



1 + E 



i=l 



'2ti 



^i\J fm ~l" fm fl 



dzi 



(4.37) 



The threshold is then given implicitly as a function of the parameters of / through 
Zi, as the value of the parameters such that /J*"^ x^{d-,t) = ot. 

For example, one form of / might be a sinusoidal modulation plus a constant 
background: /(0) = B + y4sin27r0j. The average value over is B, so /i = B. 
We may now ask the question: for a given background S, what is the threshold for 
detecting a modulation of strength A < Bl Comparisons of the threshold with the 
modulation expected on physical grounds can put null experiments into perspective 



]J, |§3[)' well as help identify potential candidate sources for pulsation detection. 
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4.2.6 Bin- free Maximum Likelihood 

The application of maximum likelihood methods to pulsation searches binned photons 
to simplify and speed calculations. It is important to remember that this binning is 
not an essential element of likelihood methods. Recall from equation ([4.19| ) that the 



likelihood function depends on a generic model function /. We may construct the 
maximum likelihood analogue of the by taking 

m 

fm{4>) = c^o + 51 sin 27rA;0 + cos 2iTk(j)) (4.38) 
fc=i 

This function actually has 2m + 1 degrees of freedom. Given this, we can calcu- 
late equation ( [4.19|) , and maximize it with respect to the ak and jSk- In contrast 
with the binned case, we must now maximize the likelihood simultaneously for the 
model parameters. Instead of m one-dimensional maximizations, we must perform 
one (2m + l)-dimensional maximization]^ Since the Ai and the Bi are taken from nu- 
meric tables and the functions involved are rather complex, it is unlikely that it will be 
possible to maximize the functions analjd;ically. Numerical methods for multidimen- 



sional maximizations are notoriously slow and prone to finding local minima ||166 
Nevertheless, all the principles discussed above remain valid. The analogue to the 
H-test would be found by marginalizing over m with a suitable prior. 

Sinusoids are familiar, and so are an obvious choice of basis. However, any set of 
basis functions may be used in a parameterization of /. Generalizing in a different 
way, we may imagine k bins with arbitrary boundaries, so that the bins may not be 
the same size. This would be represented by a stepwise function 

/(0) = /., J;/ (4-39) 

i J e [0, k) 

where kj indicates the beginning of bin j, fj is the flux in bin j, and there are k bins. 



^Reducing the number of Fourier components retained would not help this problem. Optimization 
with n Fourier components still requires an n-dimensional maximization, as opposed to n one- 
dimensional maximizations in the binned case. 
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Again, however, if the kj are allowed to vary, the likelihood function is no longer 
separable, and all the model parameters must be simultaneously optimized. 



4.3 Searching for Pulsars 

Now that we have a statistical framework in which to search for pulsed flux in EGRET 
data, it is natural to use it to examine unidentified sources. A number of unidentified 
EGRET sources are positionally coincident (within large error regions) with known 
radio pulsars |^3|, |116| , |136|| . Other EGRET sources are concentrated in the Galactic 



plane. Since there are no other known strong Galactic 7-ray sources, and the Geminga 
pulsar is known to be radio-quiet, it is widely assumed ||141| , |163| , |126| , |116|| that some 



or all of these sources are pulsars. It may be possible to detect pulsation from these 
sources without the use of ephemerides from other wavelengths. 

One of the difficulties of such an analysis is the choice of the range of parameter 
space to search. The fastest millisecond pulsar has a period of 1.5 ms, and the slowest 
pulsars have periods of several seconds. The longest period pulsar observed in 7-rays 
is Geminga, with a period of about 237 ms. A reasonable search strategy would cover 
as much of this space as possible. The range of period derivative to search depends 
strongly on the period being searched. Equation ( |4.23|) gives the minimum period 
derivative. Observed pulsars fall below an empirically determined cut-off in period 
derivative (Figure ^]^). All known 7-ray pulsars have period derivatives smaller than 
the cutoff. The known pulsar population occupies a relatively compact area on a 
plot of period versus period derivative, allowing the appropriate regions of parameter 
space to be searched for periodic signals. 

Unfortunately, according to equation ( ^.23| ), the minimum period derivative to 
search decreases with decreasing period, and according to Figure [4.5| , the maxi- 
mum period derivative increases with decreasing period. Therefore, as we search 
shorter and shorter periods, the number of period derivative steps required increases 
rapidly. However, for periods larger than that where the sampling limit given by equa- 
tion ( 4.23| ) crosses the period-derivative cut-off, we may ignore the period derivative. 



The time required to search a given region of parameter space thus depends weakly 
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Pulsar Period (s) 



Figure 4.5: Periods vs. period derivatives for pulsars in the Princeton database [ 187 |. 
The sohd hne is the high period-derivative cut-off hne; almost all pulsars lie below 
and to the left of this line. The dotted line is the sampling limit; below it, period 
derivatives may be neglected for pulsar searches. This limit depends inversely on the 
square of the length of the observation; 10 days is assumed here. The dashed line 
corresponds to a characteristic age of 1000 years; no younger pulsars are expected. 



on the maximum period to be searched, and strongly on the minimum period to 
be searched. A minimum period of 50 ms and a maximum period of 500 ms would 
contain all of the known 7-ray pulsars except the Crab. This range of period, and 
the appropriate range of period derivative for each period, is taken as the canonical 
search range. 



4.3.1 Measurement of Known Pulsars 

To verify that timevar is both statistically accurate and bug- free, several known 
pulsars were examined as if they had been the subject of a search. The results of 



those analyses are summarized in Table 4.2. Each known 7-ray pulsar was examined 



in one viewing period, chosen for good exposure to the source. A periodicity search 
was carried out for a small range around the known period, and the significance of 
the detection is given as if a single period had been tried. To find the total detection 
probability, the number of trials is found by calculating the number of parameter space 



CHAPTER 4. PERIODIC TIME-SERIES ANALYSIS 



93 



Pulsar 


VP 


True Period 


timevar 


One-trial 


Trials 


Final timevar 






(ms) 


period 


probability 


Needed^ 


significance 


Geminga 


310.0 


237.097785 


237.097784 


4.8 X lO-''^ 


5.5 X 10*^ 


1 -2.6 X 10-^4 


Crab 


001.0 


33.38981606 


33.38981541 


9.5 X 10-"6 


5.1 X 1016R 


1 -4.8 X 10-*29 


Vela 


301.0 


89.29607605 


89.2960604808 


6.7 X 10-^6 


8.7 X 10*2 


1 - 5.8 X IQ-^^ 


B 1706-44 


232.0 


102.4544637 


102.24544626 


2.6 X IQ-^^ 


8.8 X 10*2 


0.7712 


B1055-52 


014.0 


197.1102023 


197.110146607 


4.3 X 10-* 


8.4 X 10*2 





1951+32 


203.0 


39.530908008 


39.5308978685 


2.0 X 10"* 


1.7 X 10*^ ^ 






Searching a period range from 50 ms to 500 ms 
^Searching from 10 ms to 500 ms 

Table 4.2: Detection of known 7-ray pulsars by timevar. The one-trial probability 
is the chance of observing the maximum likelihood value in a single trial. The number 
of trials is determined from the required sampling (§ f4.2.3| ) to search a period range 
from 50 ms to 500 ms, with the period derivative range appropriate for each period. 
For the Crab pulsar and PSR 1951-1-32 the search range is 10 ms to 500 ms. The 
total significance is the chance that no such likelihood would be observed in the given 
number of trials under the null model. 



samplings required for the canonical search range for that observation. Unfortunately, 
we will find that this range is too ambitious to search with current technology. 

The three brightest steady-state EGRET sources are Vela, Geminga, and the 
Crab. All three of these objects would have been easily discovered by timevar in a 
search of EGRET data with no prior information about their periods. Their signif- 
icance is definitive; there would be no doubt about the validity of the detection. A 



light curve from Geminga is shown in Figure |4.2| . A plot of likelihood versus period 
for the Crab is given in Figure [4 .61 , and the light curves associated with the most 
likely period are given in Figure [4.7| . The likelihood peak in Figure |4.6| is very sharp. 
Note, however, that the probabilities given in Table |4.2| are taken only from the peak 
value of the likelihood. The Bayesian significance would be found by multiplying the 
likelihood function by the prior, and integrating over the width of the peak.|U] In this 
case, the choice of prior is largely irrelevant, since the likelihood is so sharply peaked. 



Figure [4.7| shows the maximum likelihood light curve as the flux measured in each 
bin, as well as the raw photon count within the energy-dependent 68% containment 
radius as given by equation ( |1.1|) for the same pulsar period. 



"In principle, one would integrate over the entire range. Since Figure 4.6 represents ln£, the 
contribution of all points not in the peak is negligible. 
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-10.170 -10.160 
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Figure 4.6: Twice the likelihood (C) of flux modulation from Crab as a function of 
trial period for a 10 bin light curve. C is distributed as Xg the null model. The 
maximum likelihood period is 33.38981541 ms. 



Figure 4.7: Light curves of Crab at the maximum likelihood period. The left light 
curve is the maximum likelihood flux in each bin. The right light curve is the raw 
photon count within the energy-dependent 68% containment radius. 



CHAPTER 4. PERIODIC TIME-SERIES ANALYSIS 



95 




10 -4.00 -5.90 -5.80 
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Figure 4.8: Twice the likelihood of flux modulation in Vela as a function of trial 
period for 10 bin light curves. The maximum likelihood period was 89.2960605 ms. 



The likelihood peak for Vela in Figure |4.8| is not nearly as sharp as that for 
the Crab. The viewing period analyzed began on 22 August 1991 and ended on 
5 September 1991, soon after a pulsar glitch on 20 July 1991 It is likely that Vela's 
period was not very constant during this observation, causing the broad likelihood 
peak observed. 

In addition, PSR 1706-44 would have been detected with a significance of ~77% 
(Figure [4.10| and |4.11| ). While this is not high enough to conclude that the emission 
is truly pulsed, it is based on the data from a single viewing period. Such detections 
should be rare enough that a small search in a different viewing period would be 
feasible to confirm the detection. 

The remaining two known 7-ray pulsars, 1055-52 and 1951+32, were not detected 
significantly. 



4.3.2 Searches for Geminga-like Pulsars 

Of course, given the presence of Geminga, largely invisible to radio observations, as 
well as the large number of unidentified EGRET sources near the plane, it is natural to 
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Figure 4.9: Light curves of Vela at the maximum hkehhood period. The left hght 
curve is the maximum likelihood flux in each bin. The right light curve is the raw 
photon count within the energy-dependent 68% containment radius. 



i^i^^wwi' Wffwi 

54.4 54.6 54.8 55.0 

Period - 102400. microseconds 



Figure 4.10: Twice the hkelihood (C) of flux modulation from PSR 1706-44 as a 
function of trial period for a 10 bin light curve. The maximum likelihood period was 
102.24544626 ms 
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Figure 4.11: Light curves of PSR 1706-44 at the maximum hkehhood period. The 
left hght curve is the maximum hkehhood flux in each bin. The right light curve is 
the raw photon count within the energy-dependent 68% containment radius. 



suspect that some of these sources are Geminga-like pulsars (e.g., ||141| , |126| , |163|| ). To 
begin the gargantuan task of examining the unidentified EGRET sources, we separate 
those which are likely to be good pulsar candidates. The criteria for determining good 
candidates are steady emission, a hard spectrum, and close proximity to the Galactic 
plane ||, [lT6|. 

One promising candidate is 2CG075-I-00, which is also known as GRO J2019+3719, 
and hereafter will be referred to as 2CG075. It lies very close (<0.5°) to the galactic 
plane, and has a very hard photon spectrum {a = —1.40 ± 0.14) with a break occur- 
ring at about 1 GeV. Similar behavior is seen in Vela, Geminga, PSR 1706-44, and 



perhaps PSR 1055-52 There are no known radio pulsars inside EGRET's 95% 
error contours of 2CG075. 

The brightest unidentified 7-ray source is known as GRO J1745-28, and is posi- 
tionally coincident with the Galactic center. While it is unlikely that the Galactic 
center is occupied by a pulsar, it is quite possible that a pulsar may be nearby. In 
addition, a significant spectral break has made it a popular choice as a pulsar candi- 
date 11116 . 



The EGRET error box on 2EGS J1418-6049 is smaller than the field of view 
of one of the X-ray cameras on the ASCA satellite. Analysis of the ASGA data 
in the EGRET error box has identified a potential X-ray counterpart ||1 70|| . The 
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Source 


VP 






-^TH,trials 


(/) 


Detection Possible? 


Geminga 


310.0 


443 


66 


131 


562 


Yes 


Vela 


8.0 


1937.18 


138 


277 


3699 


Yes 


Crab 


310.0 


1638 


127 


279 


218 


Possible 


1706-44 


232.0 


2162 


146 


293 


170 


Unlikely 


2CG075 


203.0 


2736 


184 


367 


314 


Possible 


J1745-28 


5.0 


2134 


162 


377 


158 


Unlikely 


J1418-61 


314.0 


1507 


136 


316 


184 


Unlikely 



Table 4.3: Thresholds for pulsed sources. For each source, a viewing period is 
selected for observation. (B) is the expected number of background photons within 
5?85 of the source. Jth is the minimum number of source counts needed to detect 
pulsation in a single trial, and /th, trials is the number of counts needed to detect 
pulsation in a period search. (/) is the expected number of source counts, based on 
the source flux and exposure in the viewing period. 



EGRET data was searched for periodicity over a small region of parameter space in 
coordination with the X-ray effort. 

Before too much computer time is spent examining candidate pulsars for fluctua- 
tion, it is worthwhile to consider the thresholds for pulsation detection. We can find 
the threshold for pulsed signal detection, assuming that all of the source flux is pulsed, 
and using the background in the direction of the source. The thresholds and actual 
flux levels for some sources of interest are given in Table [4.3| . In accordance with what 
we have already seen. Vela, Geminga, and Crab significantly exceed their threshold 
flux. PSR 1706-44 is somewhat below threshold; in fact, in the absence of indepen- 
dent data to confirm our detection, we would not have considered PSR 1706-44 to be 
detected in a 7-ray period search. 

Another useful application of threshold analysis is the estimation of the number of 
radio-quiet, Geminga-like pulsars that may be observed by GLAST. There have been 
many estimates of the Galactic population of such sources, based on the predictions 



of various pulsar models [pl5| , |141| , |126| , pl6|| . Here we will estimate the instrumental 
threshold. As we have seen, the threshold depends on the amount of exposure to the 
source. Assuming we do not already know the pulse period, then a longer exposure 
means that we must search period space more densely (§ [4.2.3| ). This requires more 
trials, and thus, more computation time. An example is shown in Table [4.4|. The 
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Figure 4.12: E/D"^ vs. pulsar period. Pulsars most likely to be identified 
have the highest apparent luminosities. The EGRET threshold is approximately 
5 X 10-" pl|] . 

EGRET gas map is integrated over a hypothetical GLAST 68% containment radius 
of 1?5 to obtain a background flux (1.9 x 10^^ cm~^ s^^)- The detection threshold 
for a period search from 50 ms to 500 ms is given. The threshold continues to decline 
for longer exposures, while the number of trials required increases. The threshold 
includes the statistical effects of the number of trials, so the only limiting factor is 
computation time. 

While the exact value of the detection threshold will depend on the details of 
the structured background at the point of interest, and the details of the instrument 
exposure, we may take 1.5 x 10'' to be a reasonable threshold for the detection of 



pulsation in radio-quiet pulsars in a period search. Yadigaroglu & Romani ||216|| list 
35 unidentified EGRET sources near the plane. Thirty of these have fluxes well above 
our threshold; three more are close to the threshold, and three are below it. While 
GLAST will detect many more sources than EGRET, it is unlikely to detect many 
new bright sources. It seems likely, then, that GLAST will be capable of definitively 
searching 25-30 radio-quiet pulsar candidates, either detecting pulsation or setting 
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Elapsed Time 


Exposure 


Threshold 


Trials 


Threshold 


(Days) 


(10^ cm2 s) 


(known period) 




(period search) 


1.7 


1.5 


11.1 


1.7 X 10^" 


20.7 


3.4 


3.0 


7.9 


1.3 X 10^^ 


15.0 


5.1 


4.5 


6.4 


4.3 X 10^1 


12.4 


6.8 


6.0 


5.6 


1.0 X 10^2 


10.9 



Table 4.4: GLAST thresholds in units of 10~^ cm^^ s^^ for the detection of pulsed 
sources near the Galactic plane for known ephemerides, and for period searches. The 
number of trials is that required to search from 50 ms to 500 ms, with appropriate 
period derivatives. 

stringent upper limits, effectively eliminating pulsars as an identification of these 
sources. 

4.3.3 Results 

Searching EGRET data for unknown-period pulsations is a very slow process. 2CG075 
was analyzed with an array of approximately five Sparc 5 and Sparc 10 processors. 
Part of one viewing period (203.0) was searched from 137.8 ms to 500 ms in period 
and the requisite range of period derivative (§ |4.2.3| ). The analysis required approxi- 
mately three months of continuous computation. The most significant pulse period 
and period derivative, while unlikely in a single trial (P = 5.6 x 10^'^), was not sig- 
nificant given the number of trials (4.6 x 10*^). To make sure that the detection was a 
statistical anomaly, a different viewing period (2.0) was searched in a very small win- 
dow around the best ephemeris. There was no significant pulsing in the independent 
data. 

J1418-604 was searched for periodic modulation in viewing period 314.0. No 
significant pulsation was seen. J1745-28 was searched in viewing period 5.0. While a 
very low significance detection was made, examination of the candidate pulse period 
in viewing period 223.0 revealed no significant modulation. 
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Candidate VP Period Range Best Period Significance 



2CG075 203.0 
J1418-604 314.0 
J1745-28 5.0 



137.8-500 ms 
226.5-300 ms 
235.3-400 ms 



140.950161 
271.870713 
277.760942 



1.19 X 10-3^ 
0.55% 
87.7% 



Table 4.5: Results of period search with timevar. The infinitesimal significance of 
the 2CG075 detection stems from that fact that the single-trial significance of the 
most significant period would be expected to be observed about 260 times in the 
number of trials performed. 

4.4 X-Ray Binaries 

X-ray binary systems offer an excellent example of the advantages of coherent pulsa- 
tion analysis with the methods used in timevar. Most orbital periods are between 
tens of minutes and several hours. This is comparable to, or longer than, the typical 
time scale of exposure changes in EGRET; thus any attempt to resolve orbital flux 
modulation must take explicit account of exposure changes. Meanwhile, 7-ray fluxes 
of X-ray binaries are low enough that direct observation of a single period of modu- 
lation is impossible. As with pulsar analyses, epoch folding offers a way to improve 
the signal-to-noise ratio and increase chances for the detection of orbital flux modu- 
lation. However, not only must the arrival times of photons be epoch folded, but also 
the changing exposure must be folded at the same period and phase to find accurate 
fluxes. 

X-ray binaries consist of a neutron star or black hole accreting material from a 
normal star. They constitute the brightest sources in the X-ray sky. X-ray binaries 
are divided into "low-mass" and "high-mass" systems, in reference to the normal 
companion. Low-mass X-ray binaries have an older star as a companion that generates 
little or no stellar wind. In order to emit significant X-ray power, the companion star 
must overflow its Roche lobe. The overflowing material then accretes onto the neutron 
star or black hole. If the system contains a neutron star with a large magnetic field, 
the material will flow along the field lines and accrete onto the polar cap. Otherwise, 
the material will form a thick accretion disk. X-ray emission can come from either 



the polar caps [ p. 3 7)1 or the inner edge of the accretion disk ||179|| . High- mass X-ray 
binaries have O or B stars as companions, which generate substantial stellar winds, 
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depositing 10 ®-10 ^°Mq per year onto the compact object. This wind is sufficient 
to generate the observed X-ray luminosities ||207| . 



Two major features are observed in X-rays over the binary orbit period. The 
first is a simple echpse of the compact object, resulting in a rapid drop in the X-ray 
luminosity. The second is a dip in the X-ray luminosity, presumably caused by the 
obscuration of the compact object by the thick accretion disk ||207|| . Individual X-ray 



binaries may exhibit one or both of these features. The periods of the observed X-ray 
binary orbits range from 11.4 minutes to 398 hours. 

4.4.1 Thresholds and Searches 



The low-mass X-ray binaries listed in Table 4.6 and the high-mass X-ray binaries listed 



in Table [4.7| , taken from the lists in White, Nagase, & Parmar ||207|| , were found to 
have a non-zero average fiux with a significance of greater than la. However, without 
a detection of pulsation at the X-ray period it is not justified to claim an association 
of the 7-ray excesses with the X-ray binaries. Only one source, Cygnus X-3, was 
detected at the 5a level in EGRET data. 

Thresholds for detection of periodicity were calculated using the method described 



in § [4.2.5| for an extensive list of X-ray binary candidate sources. It was assumed that 
the source variation was sinusoidal. For any given values of source and background 
strengths, the detection threshold is a strong function of the duty cycle of the source. 
While it may be hoped that some X-ray binaries have duty cycles as short as 20%, it 
is more likely that 7-ray emission is fairly constant for most of the orbit, then drops 
sharply but briefiy during eclipse. It was further assumed that the number of photons 
in any bin was large enough so that the Poisson distribution is well approximated by 
a Gaussian. This required a rather coarse division into five phase bins. 

X-ray binaries which would be near the EGRET detection threshold if all their 
fiux were modulated at the orbital period were examined with timevar. All photons 
from the source were epoch folded for a small range of periods around the known X- 
ray orbital period. They were assigned likelihood values with regard to the source and 
background strengths. The instrument exposure to the source was also epoch folded. 
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and the exposure for each bin was obtained. The hkehhood of source fluctuation 
could then be calculated as in M.'2.'3. 



4.4.2 Results 



Most X-ray binaries ||207|| have source-to-background ratios so low that even if their 
orbital modulation fraction were 100%, the modulation would be undetectable. It 
would still be possible to detect modulation in such sources if the duty cycle were 
sufficiently short, but under the standard model of X-ray binary emission, this is 
unlikely. For candidate sources that have parameters near threshold, a maximum 
likelihood period search is done. The sinusoidal assumption is dropped in favor of a 
five independent bin light curve model. The choice of five bins was made to maximize 
flexibility in the model while retaining sufficient numbers of photons in each bin. 
Photons are barycenter corrected and epoch folded with trial periods in a small range 
(±10-20%) of the known X-ray orbital periods. In contrast to the pulsar searches, 
the longer periods of X-ray binaries made it necessary to epoch fold the instrument 
exposure as well. While the exposure was fairly evenly distributed among bins in the 
shortest period binaries, it could be quite uneven for longer period X-ray binaries. 
Some sources were observed in more than one viewing period; Fourteen promising 
low-mass X-ray binaries (Table |4.8|) and four promising high-mass X-ray binaries 
(Table [4. 91 ) yielded no periodic signal detections significant at the 99% level. 

Several sources were analyzed despite a low signal-to-noise ratio, in case they 
were to display variation with a very short duty cycle. Only Cyg X-3, which has been 
extensively studied in gamma rays ||105| , |135| , |139|| , was bright enough to have a non- 



negligible chance of being detected, assuming a sinusoidal light curve. No evidence 
for variation was found in any of the sources. 
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Name- 




VP 


Pholons/bin 


Max Modulal 


ion 


Tlir(;sliold 


X0543-682 = 


Cal 83 


329.0 


8 


14. 


4% 


>80% 


X0547-711 = 


Cal 87 


224.0 


13 


33. 


,2% 


75% 






17.0 


57 


13. 


,6% 


45% 


X1124-685 = 


N'Mus 91 


230.0 


22 


14. 


6% 


65% 


X1323-619 = 


4U1323-62 


23.0 


63 


4. 


,7% 


40% 


X1455-314 = 


Cen X-4 


217.0 


14 


15. 


3% 


75% 


X1625-490 




23.0 


94 


7. 


,2% 


35% 






529.5 


76 


9. 


,0% 


40% 


XB 1636-536 




27.0 


135 


1. 


,8% 


30% 


X1656+354 = 


: Her X-1 


9.2 


37 


6. 


,7% 


50% 


XB1658-298 




232.0 


226 


2. 


,0% 


25% 






5.0 


438 


1. 


2% 


~10% 


X1659-487 = 


GX339-04 


336.5 


98 


2. 


,6% 


35% 






270.0 


136 


5. 


,3% 


30% 






226.0 


127 


2. 


1% 


30% 






323.0 


178 


2. 


1% 


27% 






210.0 


31 


4. 


4% 


60% 






214.0 


38 


7. 


,6% 


52% 






5.0 


246 


2. 


,2% 


22% 






219.0 


12 


18. 


,8% 


75% 






302.3 


68 


2. 


,2% 


40% 






423.0 


54 


1. 


,7% 


45% 


X1735-444 




226.0 


183 


0. 


,4% 


27% 


X1755-338 




226.0 


183 


0. 


7% 


27% 






229.0 


18 


4. 


,9% 


68% 


X1820-303 




323.0 


260 


0. 


,4% 


17% 


X1822-371 




508.0 


47 


2. 


,5% 


47% 






529.5 


46 


3. 


1% 


47% 


X1908+005 = 


: Aql x-1 


43.0 


24 


14. 


7% 


60% 


X1957+115 = 


: 4U1957+11 


331.5 


42 


3. 


1% 


50% 


X2023+338 = 


: V404 Cyg 


303.2 


62 


13. 


4% 


42% 


X2127+119 = 


: AC211 


19.0 


46 


1. 


,6% 


47% 



Table 4.6: Low mass X-ray binaries. For each source, the viewing period of the 
observation is hsted, along with the average number of photons in each phase bin, 
the modulation fraction that would be measured if all the source flux were modulated, 
and the threshold modulation fraction that would yield a 99% significance detection 
in half of all possible data sets. 



CHAPTER 4. PERIODIC TIME-SERIES ANALYSIS 



105 



Name VP Photons/bin Max Modulation Threshold 



XU532-DD4 = 


T A /T/^ "V" A 

LMC X-4 


D.U 


A A 

44 


16. 


1% 


50% 






17.0 


52 


13. 


070 


4570 






224. U 


12 


37. 


3% 


80% 


X0538-641 = 


T T\ /r/~1 AT" O 

LML X-3 


6.0 


42 


7. 


97o 


50% 


X(J54U-d97 = 


T A /T/^ AT" 1 

LMC X-1 


17.0 


55 


24. 


0% 


43% 






D.O 


4b 


10. 


9% 


/I '70/ 

47% 


X1119-D03 = 


AT" O 

Len X-3 


14.0 


-\ C A 

164 


3. 


7% 


27%) 






402.5 


31 


18. 


6% 


cr 70/ 

57% 






402.0 


27 


19. 


3% 


cr 70/ 

57% 






208.0 


19 


21. 


5/0 


70% 






215.0 


8 


22. 


3% 


onO/ 

>80% 


Al538-5z2 = 


QV Nor 


516.1 


20 


1. 


2% 


7nO/ 

70% 


X1700-377 = 


HD153919 


508.0 


36 


5. 


4% 


r /I 0/ 

54% 


VI Qi^f;_i_'?f;n — 
-A-iyoun^oou — 


- Pvcr Y 1 


OlO. 1 


fin 


7 


o /o 


44% 






601.1 


36 


15. 


8% 


54% 


X2030+407 = 


: Cyg X-3 


203.0 


476 


13. 


3% 


15% 






2.0 


296 


14. 


4% 


21% 






7.1 


133 


10. 


8% 


30% 






212.0 


243 


14. 


8% 


23% 






303.2 


77 


16. 


1% 


38% 






328.0 


67 


27. 


9% 


42% 






331.0 


30 


26. 


4% 


57% 






331.5 


48 


15. 


9% 


47% 






333.0 


64 


3. 


6% 


42% 






601.1 


32 


26. 


1% 


57% 






34.0 


54 


5. 


1% 


47% 






pl2Q 


1039 


10. 


6% 


12% 



''Combined data from Phases 1 and 2. 



Table 4.7: High mass X-ray binaries. For each source, the viewing period of the 
observation is hsted, along with the average number of photons in each phase bin, 
the modulation fraction that would be measured if all the source flux were modulated, 
and the threshold modulation fraction that would yield a 99% significance detection 
in half of all possible data sets. 
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Source Name X-ray Orbital Significance of 
Period (hr) 7-ray modulation 



AUt)4o-DoZ 


ZD.U 


U.zo/o 


X0547-711 


10.6 


56.0% 


XI 124-685 


10.4 


0.7% 


4U1323-62 


2.93 


0.25% 


Cen X-4 


15.1 


54.4% 


X1624-490 


21.0 


54.2% 


XB1636-536 


3.8 


8.8% 


Her X-1 


40.8 


5.0% 


XB1658-298 


7.1 


49.7% 


GX339-04 


14.8 


60.4% 


X1755-338 


4.46 


1.% 


4U1957+11 


9.3 


0.25% 


X2023+338 


5.7 


94.0% 


X2127+119 


17.1 


0.65% 



Table 4.8: Low- mass X-ray binaries that were searched with timevar, their orbital 
periods, and the significance of 7-ray flux modulation as found by timevar. 



Source Name X-ray Orbital Significance of 

Period (hr) 7-ray modulation 
LMC X-4 33^6 77.5% 
LMC X-3 40.8 10.1% 
LMC X-1 101.28 54.5% 
Cyg X-3 4.8 17.8% 



Table 4.9: High-mass X-ray binaries that were searched with timevar, their orbital 
periods, and the significance of 7-ray fiux modulation as found by timevar. 



Chapter 5 



Conclusions 



Most astrophysicists consider the study of statistics and statistical methods only 
slightly more interesting than taxonomy and speeches by university presidents. While 
this feeling is by no means unique to astrophysics, it is particularly unfortunate in 
a field where statistics play such a pivotal role in our understanding of the scientific 
data. 

Astrophysics may be differentiated from other specialties by the unique nature of 
experimentation. In fact, we have a separate word to describe experimental astro- 
physics: observation. The choice of language highlights the fact that in astrophysics 
more than any other realm of physical study we are most often passive collectors of 
data, rather than active experimenters manipulating controlled environments. This is 
not to say that astrophysicists are lazy; indeed, enormous amounts of work have gone 
into the design, construction, and analysis of all kinds of observatories. Concentrating 
on high-energy astrophysics, we have seen the great contribution of SAS 2, COS B, 
and EGRET. We anticipate further advances from GLAST . 

Unfortunately, the institutional disinterest in statistics has put some of this tremen- 
dous effort to waste. 7-Ray instruments are carefully designed to have great sensitivity 
to individual photons. The scarce quanta are jealously collected, for they each con- 
tain a great deal of information. Indeed, the entire theoretical field of 7-ray burst 
mechanisms has been forced to account for a single EGRET photon [Q. It seems 
puzzling, then, that after all the quality efforts made to improve the capabilities of 
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each of the previous 7-ray telescopes, the statistical analysis methods were developed 
as an afterthought. The result | |117| was that the capabilities of the EGRET instru- 
ment were not fulfilled until very late in the mission ||6^. Since LIKE does not fully 
use all the information in the data, various attempts have been made to cajole the 
standard analysis programs into being more efficient with the data. 

An example will help clarify the point. LIKE applies the average photon point- 
spread function to all photons in the data set. Of course, the width of the EGRET 
point-spread function depends strongly on the photon energy. This causes LIKE to 
understate the information in the location of high-energy photons, and overstate 
the information in the location of low-energy photons. The net result is that the 
error estimates of point-source locations is significantly larger than it needs to be. 
The data set may be restricted to a smaller energy range | |104| , thereby making the 
average point-spread function a better estimate of the actual point-spread function 
for more of the photons in the data set. However, this improvement comes at the 
expense of cutting drastically the number of photons available. The efforts of the 
instrument designers to capture more photons have been wasted. 

There are many factors that go into the design of statistical tools to analyze 7-ray 
data. Simplicity of design and computational speed are often taken as the driving 
considerations. A better approach, however, is to begin with a correct statistical 
implementation that uses all of the information in the data. This theoretical imple- 
mentation can then be simplified to achieve speed and simplicity requirements. The 
advantage of the "top-down" approach is that the approximations and simplifications 
can be made rationally, fully weighing the losses in accuracy against the gains made in 
other areas. This is the spirit in which timevar has been designed for periodic signal 
analysis. Tompkins | p.99|| has developed a successor to LIKE along these lines. The 
result is a statistical method which is no more complicated than LIKE that computes 
in reasonable speed and offers much better position estimates and much smaller error 
regions. Such an approach avoids the pitfalls of an empirical design of statistical 
methods, in which ad hoc adaptations may have unforeseen consequences (§ [4.2.1| ). 

We stand now on the brink of the next generation of 7-ray astronomy. The efforts 
of scores of scientists at dozens of institutions throughout the world are producing 
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the design of a 7-ray telescope that will revolutionize high-energy astrophysics. While 
experts make extensive computer simulations of the instrument to optimize the design, 
and engineers carefully design the electronic, mechanical, and thermal structures of 
GLAST, relatively little effort is going into the design of the statistical apparatus 
to analyze the wealth of data that will someday pour forth. Unfortunately, this 
is not a project which can profitably be left until the design stage is over. Realistic 
interpretations of computer simulated data rely on high-quality data analysis software. 
The design of that software depends (§ p.2|) on the instrument design. 

The final goal of any telescope is to make observations which lead to increased un- 
derstanding of astrophysical objects. In the case of 7-ray telescopes, this is achieved 
by the design of an instrument that simultaneously optimizes various scientific goals 
(sensitive area, point-spread function, energy range and sensitivity, photon timing) 
with various spacecraft limits (power consumption, heat production, telemetry lim- 
its). It does so by optimizing different instrument subsystems: the e~e~^ tracking 
system, the calorimeter, and the triggering system, among others. A well designed 
instrument would consider the data analysis methods and software to be a separate 
subsystem, just as important to the success of the instrument as the other systems. 
In fact, instrument design choices must take into consideration the impact on data 
analysis. Various instrument design choices, such as pointing modes, zenith cuts, and 
instrument modes can have significant impact on the data analysis process, and in 
some cases, can limit the precision of the results. 

The purpose of this work is to elucidate the statistical methods best suited to 
the analysis of periodic flux modulation in 7-ray data. These methods should be ap- 
propriately modified for use with GLAST. In addition, the calculation of thresholds 
for various GLAST configurations and modes can be used to concentrate analysis 
efforts on objects that seem most likely to be detected. It is likely that some sim- 
plification of the methods will be necessary to enable the analysis to be performed 
in a useful and timely way; but these simplifications can now be made from a basis 
of a well-developed statistical method, rather than in an ad hoc manner which may 
unnecessarily degrade the quality of GLAST results. 

Statistical techniques are the double-edged sword of high-energy astrophysics. 
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Without them, the field would not exist. The existence of high-energy emission from 
some 7-ray bursts would be unknown. Our understanding of pulsar energy-generation 
mechanisms would be strikingly curtailed. Nevertheless, their misuse has caused large 
segments of the astrophysical community to doubt their validity. So called "4(t" re- 
sults are believed with a confidence of about 90% — far from the nominal 99.9937% 
which they claim. This lack of confidence comes from the misuse and misunder- 
standing of statistics and statistical methods, which is in turn a direct result of the 
general disinterest in statistical methods in astrophysics. I hope that this work has 
inspired interest in developing useful and correct statistical methods that dispel the 
indifference and improve the results of future high-energy astrophysical experiments. 



Part II 

The October 1997 
GXA5^T-prototype Beam Test 
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Chapter 6 



GLAST: The Next Generation 



The success of the EGRET 7-ray telescope has answered many questions, but it 
has also given rise to new ones. The bounty of unidentified EGRET sources un- 
doubtably holds the key to understanding a wide variety of astrophysical systems. 
Several of these sources at low Galactic latitude are likely to be Geminga-like pulsars 
215| , |126| , |163|| . High-latitude sources may be unobserved AGN, or may be a new class 



of sources not yet associated with 7-ray emission [156|. Furthermore, EGRET has 
positively identified many 7-ray sources that deserve further study. While a number 



of 7-ray pulsars have been extensively studied |152|| , additional high-quality 7-ray 
data would discriminate between competing models of energy-generation mechanisms 
|61| , p6| , |171|| . Multiwavelength campaigns to simultaneously observe AGN from ra- 
dio wavelengths to 7-rays have become an important tool in understanding energy 
generation in these distant yet powerful galaxies |138| |. The recent discoveries of 
optical counterparts to 7-ray bursts fSlI] underscores the need for a large field-of-view, 
high-energy 7-ray detector. In order to achieve these goals we require a 7-ray tele- 
scope with a large effective area, a narrow point-spread function, and good energy 
and timing resolution. 

A proposed future telescope to that end is GLAST, the Gamma-ray Large Area 
Space Telescope (Figure |6.1|) . GLAST will be based on solid-state silicon strip detec- 
tor technology to provide high-quality e^e+ tracks from pair conversion events which 
can be reconstructed to give good directional information about the incident 7-ray. A 
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Figure 6.1: Artist's conception of the GLAST satellite. The square detector array is 
held away from the spacecraft bus to minimize background. Courtesy of the GLAST 
Facilities Science Team. 

calorimeter will provide energy information and possibly some directional information 



as well 1 134, |, M, UB 



6.1 Potential Improvements 

Given the great success of EGRET, is there any call to spend significant time, en- 
ergy, and resources to build the next generation 7-ray telescope? Potential budget 
ramifications are beyond the scope of this work; nevertheless, it is useful to look into 
the science gains that might be hoped for. We will focus on the 7-ray bursts, pulsars, 
and X-ray binaries, the subject of Chapter |^ and Chapter ^. 

7-Ray Bursts. As we discussed in Chapter ^ only a handful of 7-ray bursts were 
detected by EGRET. It seems likely that some bursts have a very soft spectrum, or 
perhaps a spectral break, so that the 7-ray luminosity was far below the EGRET 
threshold. However, there are also a number of instrumental limitations on EGRET 
7-ray burst observations. The first of these is the instrument field-of-view. BATSE 
observes approximately one burst per day in the entire sky. The EGRET field-of-view 
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is somewhat less than 1.5 sr; an instrument with a field-of-view closer to 27r sr would 
observe a bursts approximately every day as well. At any given time, EGRET has a 
sensitive area of 800-1000 cm^. Increases in the sensitive area translate directly into 
increases in the number of photons observed from each burst, yielding better infor- 
mation on burst flux, time profile, and position. Perhaps most importantly, EGRET 
estimates of 7-ray burst flux were strongly constrained by dead-time considerations. 
With the EGRET instrument inactive for about 100 ms after every trigger, many 
7-ray burst photons are probably missed in the initial spike of 7-ray flux. There is 
no way to estimate the missed flux without recourse to another instrument, since the 
burst time scale is of the same order as the dead time. Thus, the ideal high-energy 
7-ray burst instrument would have a wide field-of-view, large sensitive area, good 
angular resolution at low energies, and very short dead times. 



Pulsars and X-ray binaries. The two primary difficulties in searching for flux 
modulation from pulsars are the low signal-to-noise ratios, and the large region of 
parameter space that must be searched. The instrumental parameters which alle- 
viate these issues are increased sensitive area, and improved point-spread function. 
Increasing the sensitive area means that more photons are detected in less elapsed 
time. The sampling density required (§ f4.2.3| ) depends on the total observation time, 
so the increased sensitive area means that, for the same signal-to-noise ratio, period 
space may be searched less densely. The increased sensitive area combines with an 
improved point-spread function to improve the signal-to-noise ratio. This lowers the 
thresholds for both pulsar and X-ray binary pulsed flux detection. It is likely that 
X-ray binaries other than Cygnus X-3 will be detected with GLAST in steady state, 
and many of the X-ray binaries listed in Table |4.(j| and Table will be above the 
threshold for the detection of flux modulation. In addition, the sensitivity of GLAST 
to photons down to ~20 MeV will greatly enhance the sensitivity to soft sources like 
X-ray binaries. 

The proposed GLAST instrument will excel at all three of the desired capabili- 
ties. As described in the next section, the sensitive area will be nearly an order of 
magnitude greater than that of EGRET at high energies. The point-spread function 
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7-ray Telescope Characteristic 



GLAST Baseline Performance 



Energy Range 
Energy Resolution 

Effective Area 

Point-Spread Width 

Off-axisQ width 
Field of View (FWHM) 
Point Source Sensitivity 

Point Source Location 
Mission Life 
Mass 
Power 

Telemetry (average) 
Orbit 



20 MeV to 300 GeV 

< 25%, 10 MeV to 300 GeV 

< 10%, 100 MeV to 10 GeV 
8000 cm2 above 1 GeV 
4000 cm2 at 50 MeV 

3.1° X 100 MeV/E 
(68% containment) 

1.4 X on- axis width 
2.6 sr 

3.5 X 10"^ photons cm^^ s^^ 

(1 year, E> 100 MeV, 5a significance) 

30 arcsec-5 arcmin 

5 years (2 year minimum) 

3000 kg 

600 W 

100 kbps 

600 km low-inclination 



"where the sensitive area drops to half of its on-axis value 

Table 6.1: Characteristics of the baseline GLAST instrument. Values found by 
glastsim simulation of the baseline instrument [ p!5|] . 



will be almost a factor of two smaller in radius, and the dead time will be about 100 
times less than the EGRET dead time. These factors will combine to make GLAST 
an excellent instrument for observing 7-ray bursts, pulsars, and X-ray binaries. 



6.2 The Baseline GLAST Instrument 

EGRET has revealed the 7-ray sky to be a vast resource of astrophysical information. 



As described fully in § |1.2| , the EGRET instrument is composed of three sections: a 
calorimeter, an e~e^ tracker, and an anti-coincidence system. GLAST will follow the 
same paradigm, although the technologies used for each component have advanced 
significantly in the two decades since EGRET was designed. 

The instrumental requirements for GLAST are driven by the scientific questions 
we wish to answer. Locating point sources and separating nearby sources requires 
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compact point-spread functions. Observing 7-ray bursts requires large photon collec- 
tion area and very short dead times at the lowest energies, where burst photons are 
more plentiful. Pulsar timing also requires large photon collection area, but the hard 
spectra of pulsars necessitates good sensitivity to higher energy photons. Spectral 
measurements of all sources require good energy resolution. Monitoring for transient 
events like AGN flares and 7-ray bursts requires a large field of view (FOV). In ad- 
ditional, all of these goals must be achieved while maintaining excellent (> 1 : 10^) 
background rejection to eliminate cosmic ray contamination, while staying within the 
structural and power constraints dictated by the spacecraft design. 

The GLAST instrument is still in the planning stages, and as such many options 
for achieving these goals are being discussed. Most of these discussions take place 
as variations on the baseline design as developed by an international collaboration 
of scientists from many institutions around the world (Table |6.2|) . This design was 
proposed for funding to the Department of Energy and NASA in 1998, as well as to 
other international agencies ||T5[. The baseline design currently calls for a 5 x 5 array 
of modular towers. Each tower would consist of 17 trays arranged to hold 16 layers 
of silicon strip detectors. Each layer would measure the location of passing charged 
particles in x- and y-projections. plane. In addition, each tower would contain 80 
CsI(Tl) blocks of approximately 2.3 cm x 3 cm x 31 cm arranged horizontally in eight 
layers of ten blocks in alternating x-y orientations to measure the total energy of the 
7-ray. PIN diodes attached to each end of each block allow differencing of the light 
detection in order to determine a lateral displacement. The baseline anticoincidence 
detector (ACD) covers the entire instrument to identify cosmic rays. It consists 
of plastic scintillator tiles 1 cm thick and approximately the size of a single tower, 
arranged in two offset layers to cover all cracks. 

The baseline tracker design consists of planes of silicon strip detectors (SSDs). 
Although some alternative designs (gas microstrip detectors and scintillating 
fiber detectors) are being considered, the prototype instrument tested in the October 
1997 beam test was based on silicon strip technology. Therefore, we will concentrate 
on the silicon strip baseline. 

Optimization studies have been performed using Monte Carlo simulations for 



CHAPTER 6. GLAST: THE NEXT GENERATION 



117 



Aerostudi, S.r.l. Shibaura Institute of Technology 

Boston University Sonoma State University 

Commissariat a L'Energie Atomique (CEA) Stanford Linear Accelerator Center 

Ecole Polytechnique Stanford University 

Hytec, Inc. Texas A&M University-Kingsville 

ICTP and INFN, Trieste U.S. Naval Research Laboratory 

Kanagawa University University of California, Santa Cruz 

Laboratory for High Energy Astrophysics University of Chicago 

Lockheed Martin University of Rome 

Max Planck Institut fiir Extraterrestrishe Physik University of Tokyo 

NASA Ames Research Center University of Utah 

NASA Goddard Space Flight Center University of Washington 

Table 6.2: Institutions in the GLAST Collaboration 

a variety of tracker configurations. Some of the considerations that go into such an 



optimization are discussed in § |7.2.1| . The best design found so far consists of 17 
trays of detectors, with 3.5% radiation length {Xq) Pb radiators to convert 7-rays 
to electron-positron pairs (§ p.l| ), and 400 fim thick SSDs with a 195 fim pitch, or 
distance between strips. Each tray has detectors on the top and bottom of a thick 
wafer of low-density material. Thus the SSDs on the bottom of one tray are close 
to those on the top of the next. Such a pair of SSD layers are treated as one logical 
"plane" for the purposes of e~e"'" measurements. Since there are no SSDs on either 
the top of the first tray or the bottom of the last tray, we have 16 detector planes 
in the tracker. Each layer will consist of a 5 x 5 array of SSDs, each one of which 
is 6.4 cm X 6.4 cm. The SSDs will be connected into chains along their strip axis, 
resulting in an effective size of 6.4 cm x 32 cm. The signal-to-noise ratio in each strip 
is approximately 23:1 for a minimum ionizing particle. 



Chapter 7 

Testing the GLAST Science 
Prototype 



In order to demonstrate the feasibility of the GLAST project, as well as to confirm 
observational parameters measured in simulations of the instrument, a science proto- 
type was constructed at the University of California, Santa Cruz, the Naval Research 
Laboratory, and Goddard Space Flight Center. It was tested in the parasitic electron 
beam at the Stanford Linear Accelerator Center (SLAC) in October of 1997 ||169|| . 
Modeling instrument response, both analytically and through Monte Carlo simula- 
tions, and verifying those results experimentally has become a viable way to optimize 
instrument design while minimizing costs 7- Ray reconstruction software was 

developed and tested with Monte Carlo simulations (Chapter P), and then was used 
to analyze the experimental results. Comparison of the actual beam test results with 
simulation confirmed that the simulations represent an accurate model of the beam 



test instrument, and by extension, of the baseline GLAST design [^. These results 
will be discussed in Chapter ||. 

7.1 The SLAC e" Beam 

The October 1997 beam test was conducted in End Station A at the Stanford Linear 



Accelerator Center ||169|| . End Station A is located at the end of the main linear 
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accelerator beam (Figure 7A). This choice of testing location was a combination of 
design requirements and practical considerations. For the purposes of our test, we 
desired a beam of incident electrons arriving approximately one at a time, which 
we could either directly measure with our apparatus, or first convert to high-energy 
7-rays. End Station A accommodates such a beam with the linac in parasitic beam 



operation [^. 

Parasitic mode means that the beam consists of particles which have been scraped 
away from the main beam. The main experiment at SLAG during October 1997 
was SLD, a search for Zq vector bosons. SLD required rapid, well-focused bunches 
of approximately 10^^ electrons or positrons for collision. The main linac produces 
bunches of electrons or positrons at a rate of 120 Hz, and accelerates them to 50 GeV. 
The SLD beam profile was defined with a small aperture to achieve a compact, nearly 
monoenergetic beam. Electrons (and positrons) in the wings of the profile would stop 
in the massive shields around the aperture. Bremsstrahlung 7-rays would continue 
forward, while the rest of the main electron beam was magnetically steered into the 
SLG arcs. These 7-rays could then be converted back to electrons by passing through 
a high-Z foil. The number of electrons per pulse could be limited by adjusting the 



size of the momentum acceptance in the transport line |38 



This resulting beam of "parasitic" electrons then consisted of a mix of electrons 
and positrons with a broad range of energies. Steering magnets selected electrons 
of the desired energy and delivered them to End Station A. The electron energy 
was tunable from approximately 5 GeV to approximately 40 GeV. The number of 
electrons per bunch was tunable over a wide range from less than one to many tens 
of electrons. 

Once the electron beam had been delivered to End Station A, we modified it 
appropriately for our own uses. For several runs, we took the beam directly as it 
came; usually a 25 GeV electron beam. This was useful to calibrate the calorimeter, 
to do backsplash studies with the anti-coincidence detector (AGD) and to look at 
straight tracks in the silicon tracker. However, most of the data was taken with 
a thin Gu radiator (usually 3.5% Xq) inserted in the electron beam. Between the 
radiator and the instrument was a large magnet known as B0. By adjusting the 
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Experimental Areas at SLAC 




Figure 7.1: Location of End Station A at SLAC. The October beam test took 
place in End Station A (ESA), also known as the Fixed Target Area. The linac was 
simultaneously providing beam for SLC 

magnetic field in B0, the electron beam could be steered aside and deposited in a 
hodoscopic calorimeter. If the electron in the beam shed a bremsstrahlung photon in 
the Cu foil, then its energy was reduced, and its deflection in B0 was greater. The 
hodoscopic calorimeter had 88 fingers arrayed horizontally to measure this deflection. 
This allowed us to tag the energy of 7-rays incident on our instrument to an accuracy 
of about 250 MeV. 

Events with multiple 7-rays in the tracker were quite undesirable. It was likely 
that only one of the 7-rays would be detected in the tracker, while both would deposit 
energy in the calorimeter. The apparent energy of such an event could be strikingly 
different than its true energy, and recognizing such events in the data would be very 
difficult. Therefore, it was desirable to keep the number of e~ per pulse small; for 
most of the runs, the momentum slits were set to allow approximately one e~ per 
pulse on average. Of course, the number of e~ in a pulse is a Poisson process, so there 
was exactly one electron in each pulse approximately one-third of the time. When 
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Figure 7.2: Diagram of the beam test experimental scheme. The 25 GeV elec- 
tron beam entered from the left, incident on a high-Z foil. Some electrons emitted 
a bremsstrahlung photon, which continued through the ACD, and into the silicon 
tracker. Some 7-rays converted to e~e~^ pairs in the tracker. The 7-ray energy could 
be measured with the calorimeter. A magnet, B0, deflected the original electron 
into the hodoscopic calorimeter. The angular deflection caused by B0 was roughly 
proportional to the bremsstrahlung 7-ray energy. 

the Cu radiators were in the beam to produce 7-rays, the rate could be a little higher, 
since not all e~ shed bremsstrahlung photons. 



7.2 The Beam Test Instrument 

The GLAST science prototype instrument was divided into three parts. The tracker 
consisted of 6 planes of silicon strip detectors, for precision measurement of the e~e~^ 
tracks. The calorimeter was composed of segmented blocks of Csl to measure de- 
posited energy. The anti-coincidence detector (ACD) was a set of plastic scintillators 
read out to photodiodes via wave-shifting fibers. Each of these three components was 
connected to the data acquisition system of End Station A [Q. 

7.2.1 Tracker 

Competing physical effects lead to the adoption of a number of tracker design con- 
figurations. Multiple scattering of the e~e~^ pair is the dominant source of error in 
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Figure 7.3: Pancake (left) and Stretch (right) tracker configurations, illustrated to 
scale. The gray bars represent the locations of the planes. The distance between the 
planes in pancake configuration was 30.0 mm; in stretch, 60.0 mm, except for the last 
gap, which was 30.0 mm. 

reconstructing the incident angle of low-energy 7-rays. Multiple scattering ( §8.1| ) 
refers to the process by which an electron passing through a material is defiected 
by many small scatters; it is generally inversely proportional to the electron energy 
1^. However, at high energies the granularity of the strip pitch can cause significant 
errors in the angle estimations. These two competing effects make two parameters 
relevant to the design of a silicon-strip 7-ray telescope: the ratio of the strip pitch 
to the gap between planes, and the amount of radiator inserted between planes to 
facilitate conversions. Reducing the pitch-to-gap ratio improves the resolution of the 
instrument at high energies at the expense of increasing the number of channels — 
corresponding to greater instrument complexity and power usage — or of reducing the 
field of view. Reducing the amount of radiating foil between planes decreases the 
amount of multiple scattering that each electron experiences, at the expense of fewer 
7-ray pair conversions — corresponding to a reduced detection efficiency and thus less 
exposure. 

In order to explore this two-dimensional parameter space, the beam test instru- 
ment was built with adjustable spacing between planes, and with adjustable lead 
radiating foils between planes. Each of the six cards built for the instrument had two 
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silicon-strip detectors (SSDs) attached to it; one with strips in the x direction and 
one with strips in the y direction. For consistency with GLAST documentation, a 
single detector in either direction will be referred to as a layer, while an x-y pair of 
detectors will be referred to as a plane. The test box was built with ten slots on 3 
cm centers to accommodate the cards, allowing us to vary the pitch-to-gap ratio by 
putting the cards in different slots. The beam test SSDs were 5 cm by 5 cm square, 
with a strip pitch of 236 fim and a thickness of 500 fim. Each SSD had 192 instru- 
mented strips, corresponding to 6 readout chips responsible for 32 strips each, and a 
total instrumented area of 4.6 cm by 5 cm ||169|| . 

In addition, each slot could accommodate a radiator card, a special card with no 
SSDs, but instead with a thickness of lead (Pb) foil. The distance between the lead 
radiators and the silicon detectors was approximately 2 mm. Radiator cards were 
prepared with approximately 2% Xq, 4% Xq, and 6% Xq to allow us to vary the total 
radiator in the instrument. 



Simulations before the beam test |Q suggested two instrument configurations 



that were adopted for study. The first, so-called "pancake" mode, consisted of 6 
planes of silicon, each containing an x and y layer, separated by 3 cm. This relatively 
compact configuration maximized the number of pair electrons contained within the 
tracker. However, at high energies when multiple scattering is small, the squat aspect 
ratio of this configuration accentuated the measurement error. The second mode, 
called "stretch," placed the planes as far apart as experimental conditions would 
allow. The first five planes were spaced 6 cm apart, and the last one was spaced 3 cm 
apart. This configuration allowed more low-energy pairs to escape the tracker, but 
minimized measurement error for the high-energy pairs. 

Data was taken in the stretch configuration with 2% Xq, 4% Xq, and 6% Xq 



Pancake 


Stretch 


0.00% 


0.00% 


1.71% 


1.71% 


3.71% 


3.71% 




5.4% 



Table 7.1: Radiation lengths of Pb available 
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Table 7.2: Energy bands used for analysis. The approximate geometric center of 
the energy band was often used as the average energy, implicitly assuming an E^^ 
spectrum. 



radiators, as well as with the radiator cards removed ("0% Xq"). In the pancake 
configuration, data was taken with no radiators, 2% Xq, and 4% Xq. There was not 
enough time to take 6% Xq data in pancake configuration. 

In most cases, analysis was done in ten standard energy bands to improve the 



statistics. The energy bands are defined in Table ^ 



7.2.2 Calorimeter 

Behind the silicon tracker was a prototype calorimeter made of segmented blocks of 
CsI(Tl). For all of the tracker data runs, the calorimeter consisted of eight layers 
of 6 logs, each 3 cm by 3 cm by 28 cm. Thirty- two blocks were made of CsI(Tl), 
and fully instrumented with photodiodes on each end. The other 16 blocks were 
made of Cu with holes bored into the material to simulate the equivalent number of 
radiation lengths of CsI(Tl). A diagram of the locations of the instrumented blocks is 
given in Figure \7.4\. The calorimeter, built at the Naval Research Laboratory, is fully 



described elsewhere For the purposes of the silicon tracker data analysis, the 

only calorimeter measurement used was the total energy deposited in all instrumented 
blocks. This was found by summing the counts from the analog-to-digital counters 
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Figure 7.4: CsI(Tl) Calorimeter in each projection. Shaded blocks are non- 
instrumented Cu. Unshaded blocks are CsI(Tl) with PIN diodes on each end. The 
beam was incident from the top of the diagram. 

(ADCs), subtracting the pedestals, and adjusting for the gain: 

Emeas = « ADC - I66O) (7.1) 

where a is 1.6 for gain 1, 0.25 for gain 4, and 0.06 for gain 7. The gain setting of the 
photodiodes used for measuring the energy deposition in the calorimeter blocks was 
recorded for each data run. 

7.2.3 Ant i- Coincidence Detector 

GLAST will be equipped with an anti-coincidence detector (ACD) to distinguish 
7-rays from the charged-particle background. While the ACD is designed to re- 
ject background charged particles, it is also very sensitive to the array of particles 
emitted from a 7-ray or other particle interaction in the calorimeter. High energy 
7-rays may "self- veto" as secondary particles created in the calorimeter activate the 
ACD. EGRET was especially prone to this type of event due to its monohthic anti- 
coincidence scintillator, and in fact displayed a precipitous decline in sensitive area 
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at high energies [|193|| . The GLAST ACD will be an arrangement of plastic scintilla- 
tor tiles covering the instrument which will register the passage of charged particles. 
The segmented tiles will provide independent measurements to facilitate background 
rejection while not reacting to backsplash events. To test the scintillators and the 
associated electronics, an ACD was developed for the beam test ||169|| . 

The system set up for the beam test consisted of 15 plastic scintillators. Nine 
of these were arranged along the side of the silicon tracker and calorimeter, and 
six were placed in two layers in front of the silicon tracker. The segmented design 
allows the discrimination of true charged particle events from backsplash by lending 
position information. The scintillators along the side of the beam test instrument 
were designed to verify this procedure. 

All of the beam test scintillators were read out via waveshifting fibers to photo- 
multiplier tubes. These fibers allowed the scintillators to be placed very close to one 
another, minimizing cracks through which charged particles might penetrate unde- 
tected. 



7.3 Summary of Data 

Data was collected in runs of up to two hours. The length of the run was limited 
by the amount of data that could be stored by the data acquisition system on one 
tape []T|. Over 400 runs were made over 30 days in October 1997, running 24 hours 
each day, 7 days each week. Approximately 30 collaboration members worked shifts 
to monitor the experiment and change instrumental or beam parameters as necessary. 
The data acquisition system recorded 2.1x10® triggers, which required more than 200 
gigabytes of tape. Only the 7-ray runs were useful for the tracker study; electron runs 
were used for backsplash and calorimeter studies. Useful 7-ray events were filtered 



from these triggers; this process will be described in §3.2 
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7.4 Simulations 

The October 1997 beam test was just as critical for the evaluation of GLAST simula- 
tion techniques as it was for the evaluation of GLAST technology. The verification of 
simulation results for the beam test indicated to what extent simulations of the full 
GLAST instrument could be trusted to accurately represent instrument performance 
in orbit. 

Simulations of the GLAST instrument have been successfully done using com- 
puter code called glastsim. The code is based on gismo, a toolbox of routines that 
simulates the interaction physics for a large number of particles with a large num- 
ber of materials These particles and their interactions are taken from EGS, a 
highly-tuned analytical model of quantum electrodynamical interactions and trans- 
port established by the particle physics community for the simulation of high-energy 
physics experiments ||147| , p5| . 



For the purposes of the science prototype beam test, we further modified glastsim 
to simulate the prototype instrument that we would actually be using. In the interest 
of realism, as much of the experimental apparatus was included in the simulation 
as possible. The SLAG electron beam is nearly monochromatic (to within a few 
percent in energy) because of the large steering magnets which are used to bring the 
beam to End Station A. The simulations thus assumed a monochromatic 25 GeV 
electron beam, directly incident on a 3.5% Xq foil radiator. The electrons would 
bremsstrahlung in the radiator according to the interaction cross-section. A magnetic 
field then swept away the incident electron, allowing any bremsstrahlung photons to 
continue into the instrument. Once the 7-ray entered the silicon tracker, it was 
allowed pair-produce using the standard EGS 7-ray interactor. 

Upon exiting (or missing) the tracker, the resulting particles were collected in a 
GsI(Tl) calorimeter. Since tracking the particle shower in simulations of the calorime- 
ter are complicated and thus quite slow, some simulations were done with monoen- 
ergetic incident 7-rays, without a calorimeter. When a calorimeter was used, it was 
composed of 8 layers of 8 GsI(Tl) blocks, each 3 cm by 3 cm, for a total of 13 radi- 
ation lengths. When the Monte Garlo data was read into the analysis code, energy 
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deposition into blocks that were not instrumented was ignored. Furthermore, since 
only a few percent of the electrons incident on the Cu foil shed a bremsstrahlung 
7-ray, it was significantly faster to simply inject a bremsstrahlung spectrum of 7-rays 
directly. 



Chapter 8 



Reconstructing Events 



All telescopes require a method of converting the raw data recorded by the instru- 
ment into relevant observational parameters. For a 7-ray telescope, these parameters 
consist of information about individual photons. A bright astrophysical 7-ray source 
might have an intensity of 10^^ cm~^ s~^. EGRET might measure 30 photons per 
hour from such a source. Compare this to an ordinary light bulb, which emits some- 
thing like 10^° photons per second. Of course, each optical photon is very much less 
energetic. Nevertheless, if the total energy from our hypothetical 7-ray source had 
been emitted at optical frequencies instead of 7-rays, our telescope would receive 
around two billion photons per hour. 

Clearly, the fact that photons arrive so rarely will profoundly influence the way we 
analyze our data. "Imaging" must be done statistically, with long integration times. 
Sometimes it may even be more advantageous to look at maps of some statistical 
measure, rather than directly at maps of intensity. Likelihood techniques used in 
analyzing photon information to derive astrophysical information for EGRET were 
discussed in Part |I[ Here we will concentrate on the process of deriving photon 
information for the basic instrument response, beginning with the mechanism of 7- 
ray pair production. 
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8.1 Pair Production 

High-energy 7-ray telescopes work on the principle of pair production. According to 
the rules of quantum electrodynamics (QED), a photon passing through matter may 
convert into a electron-positron pair. 

7 + nucleus — > + + nucleus (8.1) 



The probability of such a conversion taking place is roughly independent of the 
energy of the incident photon above 1 GeV, and falls off at lower energies. How- 
ever, not all interactions result in pair production. At low energies, photons tend to 
Compton scatter more readily than pair produce. At 20 MeV, approximately 70% of 
interactions in Si result in pair production. At 10 MeV, close to half of interactions 
in Si are Compton scatters The total interaction cross section for all processes 
is fairly constant down to about 10-20 MeV. While the full pair-production cross 
section is quite a complex function of incident 7-ray energy, electron energy, positron 
energy, nuclear recoil energy, opening angle, azimuthal angle, and recoil angle ||140 



several simplifying assumptions give simple estimates of bulk behavior For a 
homogeneous material, the intensity of the incident 7-ray beam falls off like 

I = Ioexp{-lt/Xo) (8.2) 

due to all interactions, where t is the thickness of material and Xq is the radiation 
length of the material. Therefore the probability of a particular 7-ray interacting in 
the material is 

P(t) = l-exp(-^t/X„) (8.3) 

Pair Production 7-Ray Telescopes. 7-Rays that pair produce offer an opportu- 
nity for detection. By tracking the resulting e~^e~ pair, we can estimate the incident 
7-ray energy and direction. The reconstructed energy will be the sum of the and 
e~ energies, corrected for energy loss in the instrument, and the incident direction of 
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the 7-ray must be the momentum-weighted average of the and e~ directions. All 
but the lowest energy electrons detected by GLAST will be relativistic, so we may 
use the energy-weighted average of their directions to calculate the incident 7-ray 
direction. 

Accurately reconstructing the particle tracks is therefore of great importance. 
Two effects hinder our efforts to do this. The first is multiple scattering of the 
electrons, (e"*" and e~ will be referred to collectively as "electrons.") At large angles, 
it is not Gaussian; however, the core of the distribution (out to approximately 3cr) is 
approximately Gaussian [^, with a projected width of 

Oo = ^^^^^V^^^(l + 0.038 lnx/X„) (8.4) 

for relativistic electrons, where E is the electron energy and x is the thickness of 
material traversed. In addition, there is some lateral displacement of the electron 
from one side of the material to the other. The rms width of the displacement 
distribution is given by 

Vrms = ^^^o (8.5) 

Note that multiple scattering becomes smaller, on average, with increased electron 
energy, and with thinner radiating material. 

The second effect which complicates track reconstruction is measurement error. 
Most technologies proposed or used for 7-ray telescopes are based on wires or strips 
made of various materials. When an electron passes near the strip, it "fires" or records 
a hit. The strip pitch clearly affects the resolution of the telescope. The strip pitch 
divided by the gap between planes roughly determines the minimum angle that the 
telescope can resolve. 

Given a set of strip addresses which have been hit, we must reconstruct the electron 
tracks and determine the parameters of the incident 7-ray. There may be noise hits, 
spurious tracks, missing hits, or ambiguous tracks. We are limited by measurement 
error, and by energy-dependent multiple scatter. Even if we have two well-defined 
tracks, we may not know the energy in each electron, only the combined energy de- 
posited in the calorimeter. Furthermore, the x and y projections of the instrument are 
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read out separately. Given a track in the x projection, the question of which y track 
corresponds to it is ambiguous. Clearly, a good method of finding and fitting electron 
tracks will be critical to the accurate estimation of the incident 7-ray direction. 

8.2 Track Reconstruction 

The problem of estabhshing the most hkely electron tracks falls naturally into two 
steps: finding and fitting. The first step consists of choosing which hits in the tracker 
are part of the track in question. Designing good algorithms to do this is an art; in 
fact, it is similar in some ways to the pattern recognition problems being worked on 
by computer scientists. The second step consists of making the best estimate of the 
track of the electron that caused those hits. The latter is a science — an optimization 
problem — and is by far the more tractable problem. We will address the simpler 
problem first. 

Least-squared Methods. The simplest method of track fitting is the linear least- 
squares fit. We simply fit a straight line to all of the hits in the track. Since we expect 
that the total angle scattered should increase as the track proceeds through additional 
layers, we assume the uncertainty in the measurement to grow increasingly larger as 
we travel down the tracker. This method has the advantage of being very simple and 
fast. At high energies, the track should be very nearly straight, so the least-squared 
fit line will be a good approximation to the real track. However, at energies where 
multiple scattering is significant, a straight line is a poor approximation to the actual 
track. A line fit in this way will have reasonable information about the incident 
track direction, but its estimation of the final track position and direction may be 
quite poor. If we wish to extrapolate the track to the calorimeter below, we will 
require better estimates of the track parameters at the bottom of the silicon tracker. 
Furthermore, a linear least-squares fit rolls the multiple scattering and measurement 
error into one general error. As we have seen, these errors behave very differently in 
different energy limits. 
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Figure 8.1: The Kalman Filtering process. A passing electron causes a strip to fire 
in each plane (light grey). The Kalman filter uses the current track direction (black 
arrow) to predict the hit location on the next plane (grey circle). The light grey area 
represents the range of likely directions after multiple scattering. The location of the 
strip which fires is used to correct the track location, and predict the track direction 
to the following plane. 
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Kalman Filters. Given a set of liits tliat make up a track, tlie optimal linear 
fitting method is called Kalman filtering PTl E^. "Kalman filtering" is really a two- 
step process, consisting of a "filter" and a "smoother." The filter begins at the first 
hit of a track, and makes a prediction for the location of the next hit. That prediction 
is refined in light of the measured hit location, and the error matrices are updated. 
This process, called "filtering," continues to the end of the track (Figure ^.1| ). Once a 
track has been filtered, it is then "smoothed." When the filtering process is finished, 
our estimation of the track in any given plane has no information about the locations 
of hits in subsequent planes. Smoothing incorporates that information. It steps back 
up the track from the bottom, further refining the track parameters at each step 
based on the information found further down the track (Figure ^.2[ ). Friihwirth pT] 
developed a practical implementation of Kalman filters that is applicable to particle 
track fitting. 

The filter must balance the competing effects of multiple scattering and measure- 
ment error. The problem simplifies immensely if either one of these is negligible. If 
the measurement error were negligible compared to the multiple scattering, as ex- 
pected at low energies, the filter would simply "connect the dots," making a track 
from one hit to the next. Most of the information about the 7-ray direction would 
come from the first two hits, where the cumulative effects of multiple scattering are 
the smallest. However, if the measurement error is significant and multiple scattering 
is negligible (as it will be for high energy photons), all hits have information, and 
we should essentially fit a straight line to the hits. The Kalman filter balances these 
limits properly for all energies, and thus earns its title as the optimal linear filter; in 
the limit that all errors and multiple scattering are Gaussian, it is the optimal filter. 
That means that if all is Gaussian, it is completely equivalent to both a fif and 
a maximum likelihood fit. The Kalman Filter was chosen for analysis of the beam 
test data. Accordingly, we will examine the method and its implementation in some 
detail in 98. 3|- 



Track finding. Track finding is a more subjective problem. The identification of 
which hits belong to a track is a pattern recognition problem which does not admit 
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Figure 8.2: The Kalman Smoothing process. Starting at the bottom and work- 
ing up, the track estimate (grey circles) is corrected further (black track), based on 
information about the track below the plane in question. 
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an analytic solution. The basic algorithm we have adopted is based on the filtering 
procedure described above. At each plane, we use a Kalman filter to predict the most 
likely location of the hit. We then assume, in most cases, that the nearest hit to 
that predicted location is the one which belongs to the track. This simple criterion is 
complicated by caveats that allow for tracks leaving the tracker and for tracks sharing 



the same hit. The complete algorithm will be discussed in §|8^. 



8.3 The Annotated Kalman Filtering Formulae 

We base the Kalman filtering equations on those found in Friihwirth ||51|. At each 
instrument layer, we have the projected state vector, the "filtered" state vector, and 
the "smoothed" state vector. The Kalman filter will successively calculate these 
estimates of the track parameters. The state vector will contain all parameters of 
interest about the track. These might include the lateral position in x, y, or both; 
the height z in the instrument; the direction of the track as either angles or slopes; 
and the energy of the track. The state vector may be chosen to contain information 
from only one projection, separating the problem into fitting the x and y projections 
separately, or it may contain all parameters for a simultaneous fit. However, a state 
vector of length n will require inversion of nxn matrices. Computationally, it may 
be advantageous to separate independent variables into separate state vectors, and 
fit them separately. For each layer, then, we have a system equation of the form 

Xfe = Ffc_iXfc_i + Wfc_i (8.6) 

where is the state vector containing track parameters in plane k, the F matrix 
is the propagator from one layer to the next, and the random variable Wfc_i is the 
multiple scattering. 

The F matrix takes the state vector on one plane to the state vector on the next 
plane. In general, this means it will combine directional information with position 
information to compute a new position. In the absence of a magnetic field, the direc- 
tion will not change. The F matrix is indexed since it implicitly contains information 
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about the gap between one plane and the next, which may not be the same for all 
planes. 

The Kalman filter must also consider the measurement process. Given a true track 
position X, we will measure 

rrife = HfcXfc + efc (8.7) 

where rrifc is the measurement that we make, H is the measurement matrix, and the 
random variable is the measurement error. 

Also note that the dimensions of x and m may not be the same. If all properties 
of the track are not directly measured, H will not be square. Again, remember that 
we can have a separate for every plane. If silicon tracker data were combined with 
data from a sampling calorimeter, the calorimeter's H matrix would include a term 
indicating an energy measurement. 

The second part of the system and measurement equations is the random variables 
we use to represent multiple scattering (w^) and measurement error (e^). If their 
expectation values are zero, it will be sufficient to consider their covariance matrices. 
By suitably defining our state vectors and other matrices, this condition can always 
be satisfied. For notational convenience, following Friihwirth, we define 



Qfc = cov{wfe} (8.8) 
Vfc = G^i = cov{e,} (8.9) 

(8.10) 

It is helpful to make the semantic distinction between multiple scattering and 
measurement error. Both appear as random variables in our equations, but it is 
important to remember that measurement error is a description of our imperfect 
measurement, while multiple scattering is a physical distribution. 

The displacement in the detector plane of the track due to multiple scattering is 
correlated with the angle through which the track scatters If z is the thickness 
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of the detector, and 60 is the rms multiple scattering width given by equation (8.4) 
then the multiple scattering covariance matrix is 

Q = I i I (8-11) 



zel/2 

assuming w has two components, horizontal position x and track angle 9. 
Finally, we have the covariance matrix of our state vector: 

Cfc = COV {Xfc - ^k,true} (8-12) 

This will provide our estimate of the errors in our estimate of the track parameters; 
that is, the estimate of the point-spread function for each electron track. The details 
of this estimation may be found in §8.9| . 

8.3.1 The Filtering Equations 

Now, we can write down the prediction, filtering, and smoothing equations. Note 
that the equations are only stated as found in Friihwirth, not derived here. First, 
we predict the next position, using the propagator and the position on the previous 
plane: 

Xfc,proj = Ffc_iXjfc_i (8.13) 

and the next covariance matrix, found by adding to the predicted covariance the 
effects of the multiple scattering that happens in plane k — 1: 

Cfe_iF^_i + Qfc_i (8.14) 

These equations express the propagation, according to F, of the position and 
errors, with the addition of the multiple scattering covariance. 

The filtering process refines the predicted position by using information from the 
measurement on that plane. We must first refine the covariance matrix: 
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(8.15) 



Recall that is the inverse of the measurement error covariance matrix. If 
measurement errors were huge, then the measurement would contribute very little 
information. Gk would be nearly zero, and the error matrix would be just the previous 
error matrix propagated to the next plane. 

This is a good place to consider a limiting case. If all of our state vectors were 
scalars, C would look like a^. Then ( |8.15| ) would look like 

<^%Uered = (V^e.... + ^measurement) ' (8.16) 

We are simply weighting our estimate of a by the quality of the measurements. 
Using our refined C^, we can calculate 

Xfc = Cfc {Ck,proj) ^^k,proj + HjCfclllfc (8-17) 

Again, the size of controls how heavily the measurement is weighted. 

In our limiting case, we now weight our track estimate by the quality of the 
measurement: ( 8.17| ) becomes 



•^filtered -^predicted -^measured /q r>\ 

-2 = -2 + -2 (8-18) 

filtered predicted measured 

8.3.2 The Smoothing Equations 

So far, all the equations have only used information about the measurements taken 
either on the same plane, or "upstream" of the current plane. Smoothing is the 
process of further refining each position estimate in light of the information from all 
the measurements, upstream and downstream of the current plane. We first calculate 
an auxiliary matrix A: 

Ajfc = CkFl{Ck+i,proj) ^ (8.19) 
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Then the smoothed position and covariance estimates are 



^k,smooth — Xfe ~l~ -^k{^k+l,smooth ^k+l,proj 

) (8.20) 

k+l, smooth '--^fe+ljprojV fc 



Ck,smooth — Cfc + A.kiCk+1, smooth Ck+l,vroi)-^k (8.21 



The difference between the smoothed and projected versions of and may be 
thought of as the effect on the state vector and covariance matrix of the data from 
planes k and below. Now we need to know how to calculate the influence of that 
data on the estimates for the current plane. The key is A^. First, consider the case 
with no multiple scattering, when is zero. Plugging into ( ^.19|) from ( ^.141) after 
adjusting indices, we find 

A, = C,Fl{{Flr'C-,'F-,') = F-,' (8.22) 

In this case, A is just the back-propagator, taking x^+i to x^. Then ( ^.20|) just 
propagates ^x^+i back to plane k, exactly what we would expect with no multiple 
scattering to complicate the issue. 

The limit that multiple scattering is large compared to the covariance matrix is 
more complicated. Returning to the scalar case, the various C's become a^, and F 
becomes a scalar. Then our A becomes 

a = (8.23) 

k+l,proj 



Again plugging in from the scalar equivalent of ( ^.14] ), we end up with 

.2 

k, filtered 



f4 

f'^'^k, filtered + ^ms 



(8.24) 



In the limit that crlj^n^^^j. <C a.^^, this simplifies to 



r2 



filtered 



<^ms 



(8.25) 



So, the bigger the expected multiple scattering, the more we discount the information 
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in the planes below. In fact, both extremes can be seen in the scalar version: as 
multiple scatter goes to zero in ( |8.24D , a approaches unity — the information from 
later planes is not discounted at all. 

Now we have calculated the smoothed state vectors and covariance matrices at all 
points along the track. These revised estimates represent the optimal linear filter of 
the particle track. If the measurement and multiple scattering errors were Gaussian, it 
would be the optimal estimate of the track, and equivalent to a maximum likelihood 
estimate. The smoothed state vector for the first plane gives us the initial track 
direction, while the smoothed covariance matrix for the first plane gives us the "point- 
spread width" for that track. 



8.3.3 Goodness of Fit 

If the Kalman filter, under some assumptions, is identical to both least-squared fitting 
and maximum likelihood, we should demand that it produce a value or a likelihood 
as a measure of its goodness of fit. In fact, it does produce a "running x^" as it filters 
and smooths. 

For each plane, we find the residual vector 

Tfc = rrifc - HfcXfc (8.26) 

and the covariance matrix of the filtered residuals 



R-fc = Vfc — HfcCfeH^ (8.27) 

The incremental is then 



xi = riR,'rk (8.28) 

The total of the track is given by the sum of the contributions for each plane. 
The smoothed incremental can be similarly calculated: 

^k, smooth ^^k, smooth ^^k^k, smooth (8.29) 
,smooth 

Vfc — likCk,smoothiik (8.30) 
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X+ ^k,smooth^k,smooth^k, smooth (8.31) 

The incremental fo^' each plane is distributed as x^i^k), where ruk is the 
dimension of rrifc ||5T[. In fact, it is precisely in this sense that we may call it x^- Since 



it measures residuals due to multiple scattering as well as those due to measurement 

error, strictly speaking it is — 21n£, which of course is distributed as [ [^llj , 

This test may be used as a way to identify track outliers. A measurement with a 
value corresponding to the (1 — a) quantile may be rejected as not belonging to 

the track. This process will reject a measurement which actually is part of the track 

with probability a. 

The complete set of Kalman filtering equations is summarized in Appendix 0. 



8.3.4 Kalman Filter Implementation for the Beam Test 



In order to analyze the data from the October 1997 beam test, the Kalman filtering 
equations above were implemented in a C++ program called tjrecon. This program 
combined track finding and track fitting into one package which could analyze both 
beam test and Monte Carlo data. The track finding algorithms will be discussed in 



8.4 



To maintain generality, the implementation was designed to be as flexible as pos- 
sible. The X and y projections were fit separately, and their results combined after 
the fitting process. The state vector was chosen to be: 



^ horizontal position \ 

track slope 
1^ current energy y 



(8.32) 



The propagation matrix was then simply (aside from unit conversions) 

/l 1 0\ 
F = 10 
[o l) 



(8.33) 
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Recall from equation (|8.4| ) that the multiple scattering angle is approximately 
Gaussian distributed. Since we are using the slope of the track, the (1,2) component 
of F should really be tan(track slope). Of course, this destroys the linearity of the 
propagation equation, and with it the Kalman filter. So, we assume that the track 
slope will be small so that tan^^ ^ 6. Consequences of relaxing this requirement will 



be discussed in 53.10 



We assumed a measurement vector nijt with an energy component, to allow the 
possibility of extending the Kalman filter to the calorimeter. 

(strip number \ 
(8.34) 
current energy / 

Since there was no energy measurement information in the tracker, our H matrix 
was, for all planes, 

Hfc = I ^ ° ° I (8.35) 
' 

8.4 Track Finding Algorithm 

Track finding is significantly more difficult than track fitting in the following sense: 
it is impossible to rigorously prove that one track finding algorithm is better than 
another. One must implement both algorithms and run them on the data in question. 



8.4.1 The Exhaustive Search 

An exhaustive combinatorial search is the exception; it is guaranteed to find the best 
track. It is also simple to implement: for every possible combination of hits for each 
of the two tracks, apply the Kalman filter and look at the x^- The best tracks are the 
combination that give you the lowest total x^- However, the method soon becomes 
computationally infeasible. For a single track, the number of combinations n to try 
is roughly 
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where p is the number of planes and h is the average number of hits. For the beam 
test, p was 6, and h was typically between zero and 6. This would imply several 
hundred possibilities for each track in each event; an achievement which would be 
possible, though probably not at real-time speed. However, a single tower of GLAST 
will have 16 planes and could have as many as 10 hits per plane. An exhaustive search 
would require 10^^ trials, which is clearly not feasible for the expected data rates. 



8.4.2 Beam Test Algorithm 

In an attempt to avoid the computational complexity of an exhaustive search, we 
will try to make a good initial guess for the track, then vary that guess in order to 
optimize the goodness of fit. The Kalman filter naturally suggests the outline of a 
track finding algorithm, based on the prediction of the track location in the next 
plane. However, there are myriad details of such an algorithm left unconstrained. 
Unfortunately, there is no continuous metric to guide our choices — the only metric by 
which to compare different algorithms is to implement them, and see which one does 
the best. The specific choices we made in designing an algorithm for the beam test 
were verified on Monte Carlo data to succeed rather well in reconstructing events. 
However, we can make no claim that this algorithm is optimal, nor that it will be the 
basis for an algorithm for use in GLAST . 

To make an initial guess for the track location, we find the first plane with a 
hit. Presumably, the 7-ray converted in the lead above this plane, or in the plane 
itself. We assign the leftmost hit in the first plane to the track, and the leftmost 
hit in the second plane to be the second hit in the track. These will serve as initial 
guesses, but we will explore other possibilities later. Given these two hits, we may 
run the Kalman filter, predicting a state vector, and therefore a position, for the third 
plane. We assume that the closest hit to the predicted position is part of the track. 
Continuing this procedure to the bottom of the tracker yields our initial guess. 

The metric for determining a good track is the total calculated in equa- 
tion ( |8.29| ) in a slightly modified form. Because of the existence of noise, double hits, 
and the possibility of tracks leaving the instrument, we construct penalties which 
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we add to the total when a track makes an undesirable choice of hits. This is 
perfectly acceptable in the language of likelihoods — we have additional information 
to suggest that such a track is unlikely, so we subtract from the ln£ of that track. 
However, while the relative sizes of some penalties may be rigorously derived, their 
actual values must be determined empirically. 

There are two special cases that must be considered when making our initial track 
guess. The first is the possibility that the track projects to a horizontal location 
outside the tracker. In that case, the code creates a "virtual" hit with a position 
outside the tracker. The track must pay a substantial penalty to its iii order to 
use this hit. 

The second special case involves the so-called "noisy strips." It was found (§PT^) 
that a number of strips were defective. They were either completely unresponsive, 
never recording a hit, or they were very noisy, recording a hit even when no electron 
passed through. All of these strips were marked as "noisy," and masked off. Never- 
theless, the possibility remained that an electron actually did pass through the noisy 
strip. In order to allow that possibility, the algorithm checks for noisy strips near 
the projected track position. It then creates virtual hits at each noisy strip within a 
small (10 strip) radius around the projected position. The track must add a penalty 
to its ill order to use a noisy hit. 

It is assumed that each event contains an e^e^ pair. Once the first track guess 
has been made, a second track is proposed, starting with the same hit in the first 
layer and the rightmost hit in the second layer. This track guess is made in the same 
way as the first, yielding a pair of initial track guesses. 



Track Optimization. Once an initial track is established for each electron, several 
steps are taken to optimize the choice of hits. First, the two tracks are untangled so 
they do not cross, regardless of whether the respective values decrease by doing 
so. Subsequent track optimization proceeds more smoothly if the tracks are initially 
untangled. This subsequent optimization will allow tracks to cross if that is justified 
by the data. 

On each plane, there may be hits which are not a part of either of the tracks. The 
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algorithm next checks each of these to see if replacing any hit in either track with an 
unclaimed hit will improve the fit. The free hit is substituted, and following projected 
positions are re-evaluated, substituting nearby hits if appropriate. 

Then the hits of the two tracks are swapped, plane by plane, to see if swapping 
hits will increase the total of both tracks. 

The last part of the track finding algorithm consists of checking the possibility 
that a track may have left the tracker. Often, a track may take a sharp scatter 
to leave the tracker, with the result that the predicted position is inside the tracker. 
The large penalty on leaving the tracker prevents tracks from choosing this virtual hit 
unjustifiably, but the even larger penalty on sharing hits prevents unrealistic swerves 
in the track to make a single hit. 

Finally, the issue of terminating tracks must be addressed. For example, if one 
track in the x-projection leaves the tracker, then one of the tracks in the y-projection 
must be terminated. To determine which track has left, it is assumed that the track 
with the higher probably has swerved precipitously to incorporate hits that do 
not belong to it. These hits reside on planes below that at which the track left the 
tracker. Therefore, the track is terminated at that plane, and is then taken to have 
no further hits. 

A summary of the track finding algorithm may be found in Appendix 0. 

8.5 Measurement Error and Multiple Scatter Es- 
timates 

Of critical importance are the values used for the measurement error and multiple 
scattering. For the measurement error, we assume that any electron coming within 
one-half the strip pitch on either side of the strip will cause it to fire. This would 
result in a square distributions. The Gaussian with the same mean and variance has a 
standard deviation of the box width over vT2. (This is easily shown: the variance of a 
distribution is its second moment, so we find J^il2 x^P{x)dx. For a square normalized 
distribution, P{x) = 1, and the integral equals 1/12; see Figure |87^ . ) There was no 
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Figure 8.3: Measurement error distributions. The Gaussian approximation and 
square approximation to the measurement error distribution. Both are normahzed 
and have variance 1/12. 

energy measurement, so its variance is arbitrary. 



V 




(8.36) 



in units of strips 



According to equation ( |8.4| ), muhiple scattering depends on energy. We must 
therefore estabhsh the energy of the electron before we begin to fit its track. A 
samphng calorimeter could, in principle, provide this information. Otherwise, some 
approximation will have to be made. For the beam test, we assumed that the energy 
in each track was half the calorimeter measured energy. The thickness of the SSDs 
was 500 ^m, or about twice the strip pitch. From the multiple scattering equations, 
it follows that the multiple scattering covariance matrix Q should be approximately 



Q 
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(8.37) 



1/ 



If the state vector were to carry a meaningful energy estimate, the (3,3) component 
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of Q should be the variance of the energy loss by the electron as it passes through 
the detector. However, the distribution of energy losses suffered by electrons passing 
through material is not at all well described by a Gaussian, and special care must be 
taken to account for these distributions ||184|| . Potential improvements to account for 



these effects are discussed in 58.8 



8.6 Noise 

Even before the beam test actually began, it became clear that there would be some 
dead strips that never fire, and some noisy strips that fire far too often. At the time of 
the beam test, the GLAST collaboration did not have the capability to wire bond the 
individual silicon strips to the front-end readout chips. Therefore, a private company 
was contracted to perform the wire bonding. When the detectors were returned from 
the wire bonder, several of the channels drew significantly more current than they had 
before the wire bonding. Subsequent visual inspection revealed obvious mechanical 
damage to several of the strips in the form of scratches; in addition, some of the 
bonds were of inferior quality. Unfortunately, the detectors were returned from the 
contractor only one week before the beam test began, and no repair was feasible. We 
were thus left to incorporate the dead and noisy strips into our software analysis. 

For our purposes, both dead and noisy strips were treated in the same way, and 
will be referred to as "noisy." Such noisy strips will clearly confuse the track finding 
routines, and they must be dealt with. 

Initially it was thought by many that there would not be any noisy strips at all 
in silicon strip detectors. In fact, carefully prepared silicon strip detectors do exhibit 
very few noisy strips. However, the prototypical nature of the beam test instrument, 
and especially the necessity of outsourcing some fabrication, led to several types of 
noise (Figure ^.4|) . 

First of all, about 2/3 of the planes exhibited one or a few strips with high 
occupancy — typically higher than about 20% of triggers. In addition, there were also 
a fair number of strips which were dead. The reconstruction code merely identified 
these strips and flagged them for the reconstruction routines. 
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Figure 8.4: Occupancies in two planes during a typical run (Run 376). The plane 
on the left is almost completely noise free; all recorded hits were due to electrons 
passing through the plane. The plane on the right was mishandled by an outside 
wire-bonding contractor; it displayed excessive noise. The Gaussian shape of the real 
electron hits can be seen between the noisy channels. 



In addition, there were two special planes to worry about. The first was a y- 
layer with a malfunctioning readout chip. The silicon strips connected to this chip 
were independently tested and found to work correctly; however, the readout chip 
reported a set of 32 strips to be either (almost) all on, or all off. The strips numbers 
were approximately 32 through 64. 

The other special plane was the so-called noisy plane. This was an z-plane that 
exhibited a large number of noisy strips. The reasons for the large numbers of noisy 
strips has been traced to some quality assurance problems; they are detailed else- 
where | ]169 |. 

The first attempt at identifying noisy strips was done by hand. Strips were noted 
as noisy when they appeared to have a higher occupancy than expected based on 
other nearby strips. This method proved tedious in light of the number of strips 
involved, and the fact the the noisy strips are not always constant between runs. 
On a gross level, the location of the noisy strips changed several times during the 
beam test as a result of the physical rearrangement of the planes within the tracker. 
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Clearly, noisy strips moved with the planes. In addition, several studies of the effect 
of different discriminator thresholds changed the noise levels. 

For these reasons, we decided to automate the noise finding procedure. The 
algorithm works as follows: first, all strips with an occupancy above 15% were marked 
as "dead." For our purposes, we flagged dead strips and excessively noisy strips in 
the same way. Then we looked for the plane with the bad chip. This was done by 
checking each y-plane in strips 32-34 and strip 60. If they were the same to within 
a few percent, then that plane was assumed to hold the bad chip. The entire range 
from 32-64 was marked as dead. 

Next we looked for noisy strips in the wings of the distribution. Starting at strip 
95 (approximately the center of the plane) we moved out to the left until the number 
of counts had dropped by a factor of 5.5 from its peak value, ignoring any dead strips. 
We then continued to the left and marked as dead any strip with more than one fifth 
the number counts as the maximum channel. The entire process was independently 
repeated for the right side. 

Finally, we fit a Gaussian to the core of the distribution. The tails were not fit, 
since they were clearly non-Gaussian. Any strip with more than 130% of the hits 
predicted by the fit is marked as noisy. The Gaussian fit roughly represents the 
number of legitimate hits, and any additional hits are noise. Thus, we throw away 
strips where more than 30% of their hits arc noise hits. 

The resulting list of noisy strips is passed to the reconstruction routines, which 
ignore all hits from those strips. 

8.7 Energy 

Energy resolution is an important part of any 7-ray telescope. Good measurements 
of the spectra of astrophysical sources and their cutoffs offer insight into their energy- 
generation mechanisms. Good energy resolution is also important for good 7-ray re- 
construction. To conserve momentum, the 7-ray direction must be a energy- weighted 
average of the e~ and e+ directions. Furthermore, the Kalman filter estimate of the 
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electron track depends on the expected multiple scattering, inversely proportional to 
the energy of the electron. 



8.7.1 Energy Splitting 

For these reasons, it is very useful to determine the energies of each electron. A useful 
first approximation is to simply assume each track has half the total energy measured 
in the calorimeter. This is a good enough approximation to allow reconstruction of 
the tracks. One might hope that an iterative approach would allow better energy 
resolution and better track fitting. We can imagine varying the fraction of energy in 
each track to maximize the likelihood. The energy split which produces the maximum 
total likelihood would be the most likely energy split. 

This method was implemented, but it was unstable. In every case, the maximum 
likelihood energy splitting put all the energy in the straighter track, with no energy in 
the other track. This instability is due to the non-Gaussian tails on the multiple scat- 
tering distribution, as well as the small angle approximation we made in the Kalman 
filter. Tracks make large scatters much more frequently than would be expected from 
a Gaussian distribution. In order to make these large scatters reasonably probable, 
the Kalman filter radically lowers the track energy. This means that the track with a 
large scatter will have a grossly underestimated energy. Some possible improvements 



on this method will be discussed in 5B.8 



8.7.2 Energy Dispersion 

Of course, the energy measured by the calorimeter is not the true energy. In the case 
of the beam test, portions of the calorimeter volume were not even instrumented. 
Therefore, we must expect some energy dispersion. We used the simplest method 
of estimating total energy from the calorimeter. Eric Grove of NRL |^ provided 
us with a simple function (equation ( |7.1| )) to translate raw analog-to-digital readout 
counts and gain settings to deposited energy. We did not try to correct for shower 
leakage by fitting the shower profile, or to try to separate the energy deposited by 
each electron. 
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Figure 8.5: Measured energy vs. true energy in the Monte Carlo calorimeter. Mea- 
sured energies are about 20% lower than true energies, since the calorimeter does 
not collect all of the 7-ray energy. In addition, the width of the measured energy 
distribution at any constant true energy is about another 20%. 



The hodoscopic calorimeter (§ |7.1|) , which measured the energy of the "leftover" 
incident beam electron, allowed a rough calibration of the 7-ray energy. However, 
this measurement was only good to an accuracy of about 250 MeV. The hodoscopic 
calorimeter did allow the detection of some events with multiple beam electrons in a 
single pulse. These events could be identified when more than one leaded-glass block 
of the hodoscopic calorimeter recorded energy deposition. 

From 10 MeV to several GeV, we compared the energy deposited in the Monte 
Carlo simulation of the calorimeter with the true energies of the particles injected into 
the simulation (Figure 53). Over most of the range, the calorimeter detected about 
80% of the incident energy. The distribution was far from symmetric; as expected, 
many higher-energy photons were recorded as lower-energy photons, because of energy 
leakage out of the back and sides of the calorimeter as well as energy deposition in 
non-instrumented blocks. 
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8.8 Potential Improvements 

There are a number of potential improvements to the sihcon tracker analysis. Most 
of these would make very little difference for the results of the beam test — largely 
because they would affect both the actual beam test data and the Monte Carlo 
simulations in the same way. However, they may be quite important for track finding 
and fitting of GLAST data. 

8.8.1 Track Fitting 

Improve estimate of measurement error distribution. The probabihty that a 
given strip will fire is not really a boxcar, of width one strip, centered on the strip. In 
fact, it is a complicated function of the collection area geometry and bias voltage on 

the strip. It is plausible, and empirically verified, that often two adjacent strips will 
fire. This can occur when the electron passes between the strips, depositing enough 
energy into each of them to raise their potential over the threshold, or when the 
electron passes through the silicon at an angle, depositing energy in more than one 
strip. Currently, the hit which is more easily incorporated into the track is selected. 
Clearly, there is information in the fact that the other strip has also fired. A careful 
analysis of the probability of each strip firing as a function of the electron position 
would allow an accurate characterization of the probability of the true electron posi- 
tion given that both strips fired. The width of this distribution could then be used 
in the measurement error covariance matrix V instead of the current 1/12 times the 
strip pitch. 

Transform multiple scattering distribution to Gaussian. The Kalman fil- 
tering formalism is fast and accurate because it assumes distributions are Gaussian. 
However, nowhere does it require that the state vector be composed of physically 
meaningful values. It may be possible to find a function that would transform the 
multiple scattering distribution to something more nearly Gaussian. In that case, the 
state vector would contain the transformed variable instead of track slope. This would 
result in better fits, and might allow energy estimation from the track shape. The 
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Kalman filter would find the most likely state vectors, yielding physically meaningful 
information via the inverse transformation. 

Transform measurement error distribution to Gaussian. It is unlikely that 
a more accurate measurement error distribution like that described above would be 
Gaussian. However, if a simple transformation would make it nearly Gaussian, we 
would achieve the same improvement as described for multiple scattering. 

Account for total material traversed. Since both multiple scattering and energy 
loss depend on the amount of material that the electron traverses, electrons that 
cross the planes at a steep angle will be scattered more and will lose more energy. 
Unfortunately, this is an inherently non-linear phenomenon; large scattering leads to 
greater hkelihood of crossing a plane at a steep angle, which leads to greater hkelihood 
of large scattering. 

Require agreement between calorimeter and tracker reconstructed direc- 
tions. By analyzing the shape of the electromagnetic shower in the calorimeter, it 
is possible to estimate an incident direction of the particle. This information should 
be combined with the tracker reconstruction to find the best estimate of the incident 
7-ray using all available information. 

8.8.2 Track Finding 

Exhaustively try the first two points in track. While a complete exhaustive 
search for the best track will be infeasible for GLAST, trying all possibihties of the 
first two or three planes in the track would be possible. Since the first few points 
are the most important in determining the rest of the track, this would increase the 
probability that the best track is actually found. 

Require vertex in material. Pair production can only occur in the presence of 
nuclei in order to satisfy momentum and energy conservation. Therefore we can ex- 
pect the two electron tracks to project back to a common vertex which is in high-Z 
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material. Since the conversion may well happen in lead, we may not have a measure- 
ment of the electrons at the vertex. However, we know the precise position of the 
lead foil, so the requirement that the vertex lie in high-Z material will improve the 
estimates of the tracks. 

Account for tracks which steirt with a dead strip. It may happen that a 
7-ray converts close to a dead strip. In that case, the first plane will not record a 
measurement of the track. The algorithm designed for the beam test would assume 
that the conversion actually happened in the next plane. However, it should be very 
rare that the conversion happens near a dead strip in both x and y. Furthermore, a 
fit track can be extended upwards to see if it could have converted in a dead strip 
above. Therefore, it may be possible to identify situations in which the first hit of 
the track is missing. 

Assign penalties to individual strips according to occupancy. Currently, 
dead strips and noisy strips are masked identically, and all are assigned the same 
penalty for track finding. However, some of the noisy strips contain significant in- 
formation. For a strip with a relatively low occupancy, many or most of the hits 
recorded may be due to actual electron tracks. A better penalty scheme would find 
the occupancy of each strip, and assign larger penalties to strips with higher occu- 
pancies. The magnitude of the penalty would be proportional to the likelihood that 
a hit registered in that strip was actually noise. 

Use empty triggers to find noisy strips. The occupancy of each strip could 
be measured by examining which strips arc hit when there is nothing in the silicon 
tracker. For the beam test, this could be done by looking at triggers in which the 
beam electron did not brcmsstrahlung in the Cu foil. For GLAST, this can be done 
by reading out the tracker when there has not been a trigger. Careful establishment 
of the occupancies, remeasured frequently, would allow accurate penalties to be set 
for each strip. 



CHAPTER 8. RECONSTRUCTING EVENTS 



156 



8.8.3 Energy Estimations 

Take into account energy loss per plane. It would be very easy to include 
energy loss in each plane. The energy loss in a material is well characterized. The 
(3,3) component of the propagation matrix F would then become the percentage of the 
electron's energy that it retains. The multiple scattering covariance matrices of lower 
planes would then be calculated based on this reduced energy. Unfortunately, the 
distribution of energy losses is asymmetric. A proper treatment of energy loss ||184 | 
requires this distribution to be taken into account in the track fitting. 



Try to get energy estimates from tracks. If the Gaussian transformations de- 
scribed above could be found, then the tails of the multiple scattering distribution 
would no longer undermine our efforts to estimate the energy of the track from the 
track itself. An iterative approach could then estimate the most likely energy, refit 
the tracks, and re-estimate the energy until it converged. Then the energy splitting 
information could be used to make better estimates of the incident 7-ray direction. 



Fit shower profiles in the calorimeter. Fitting the shower development profile 
in the calorimeter may lead to much better energy estimates. Such fits allow leakage to 
be accurately estimated. The GLAST collaboration calorimeter team is implementing 



shower profile fits [57 



Measure individual electron energies in the calorimeter. A finely segmented 
calorimeter would enable the separation of the energy deposition from the individual 
electrons. An accurate measure of the individual electron energies would increase the 
silicon tracker resolution by allowing better energy weighting of the electron track di- 
rections to find the incident 7-ray direction. However, finer sampling of the calorime- 
ter requires more electronics and thus more power, as well as more gaps between Csl 
blocks. Sampling granularity small enough to resolve individual electron energies may 
not be feasible for GLAST. 
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Use better energy splitting probabilities. Heitler |£8|, |140|| has calculated the 
probability distributions of the energy split between the electrons as a function of 
the total incident energy. While it may not be possible to get good energy splitting 
estimates directly from the tracks, it is often possible to determine which track is 
more energetic. The most likely energy split according to Heitler could be assumed, 
with the appropriate track receiving more energy. 

The relative probability of various energy splittings as a function of total 7-ray 
energy can be found from the total pair-production cross- sect ion. In the limit of an 
unscreened point nucleus, extreme-relativistic energies, and negligible nuclear recoil, 
in the first Born approximation, the cross-section as a function of energy splitting is 

mi 



^ = AaZ'rl - ^/ + 1) In [VE{1 - /)] (8.38) 

where / is the fraction of energy in one electron, E is the total 7-ray energy in MeV, 
Z is the atomic number of the target nucleus (Si or Pb), and tq is the classical electron 
radius, 2.82x10^^^ cm. This function is plotted as relative probability versus energy 



split for a variety of total 7-ray energies in Figure p.6| . For the lowest energies, the 
probabilities appear to be negative for extreme values of the energy split; this is an 
indication that the assumptions noted above have broken down. 



8.9 Calculating the Point-Spread Function 

The point-spread function of the beam test instrument could be estimated by the 
distribution of reconstructed 7-ray incident angles. GLAST may be calibrated in the 
same way either in a beam, or in space with a sufficiently bright 7-ray point source. 
However, the Kalman filter offers us an opportunity to calculate the theoretical point- 
spread function under our Gaussian assumptions. The variance in the estimate of the 
incident direction of each electron is given by the (2,2) component of the covariance 
matrix C. To calculate the point-spread function, we should look at the smoothed 
estimate of C in the top layer. We simply read off the variance of the estimate of 
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Figure 8.6: Probability distributions of e^e+ energy split as a function of total in- 
cident energy. The vertical scale is arbitrary; each curve is normalized. The vertical 
offset is for clarity. Labels indicate total incident 7-ray energy. At the highest en- 
ergies, asymmetric energy splittings are relatively more likely; for most energies of 
interest to GLAST , the probability of all sphttings except the most extreme ones is 
approximately equal. 

the electron track direction. To find the error in our estimate of the incident 7-ray 
direction, we need to combine our estimates of the two electron tracks. If Ei is the 
energy of one electron and E2 is the energy of the other, our estimate of the incident 
7-ray direction is 

Then the variance in 6-y must be 



The variance in our estimate of the incident 7-ray direction depends on the variances 
of the two electron track directions, and on the energy split between the electrons. 
We can calculate the variances of the two electron tracks from the Kalman filtering 
equations. Since our estimates of C do not depend on the data, wc can compute the 
variances ahead of time. The variances will depend on the electron energy (through 




{Ei + E2y 




CHAPTER 8. RECONSTRUCTING EVENTS 



159 



Qfc, the multiple scattering covariance matrix given by equation (|8.37|) ), the amount 
of radiator on each plane (also through Q^), the ratio of the strip pitch to the gap be- 
tween planes (through the measurement error V and the propagation matrix F) , and 
the number of planes with measurements. Thus the variance depends on four vari- 
ables, making it difficult to display in a single plot. Given probability distributions 
for some of these variables, we may marginalize some or all of them in a Bayesian 
way (§ |2.6.3| ), leaving us with a representative "average" point-spread width. Ta- 
ble |9.2| shows the calculated width of the point-spread function for several interesting 
configurations of the full GLAST instrument. 

It should be noted that all of these calculated point-spread widths assume that 
every quantity is normally distributed. In particular, they ignore the large tails on 
the multiple scattering distribution, and they combine the estimates and variances 
of the two electron directions as if they were Gaussian distributed. They ignore the 
effects of electrons leaving the tracker, which will shorten the average track length. 
As such, they are useful for estimating instrument performance, but would not be 
suitable for likelihood analysis of GLAST data. 



8.10 Extended Kalman Filters 



In § |8.3.4| we linearized the propagation matrix F. That is, we chose to use the track 



slope in our state vector, and assumed that the additional slope from multiple scat- 
tering would be normally distributed, and would add to the previous slope value. 
Of course, the multiple scattering angle is (roughly) normally distributed, and the 
additional scattering angle should be added to the previous track angle. For small 
angles (<30°), this is a reasonable approximation. Since all the beam test 7-rays 
were incident from the same direction (0°), this approximation was valid. However, 
for GLAST , the 7-rays will be incident from every direction, and a small-angle ap- 



proximation will not be valid. There has been extensive work done [£9| |90| to extend 
Kalman filters to non-linear propagators. This would allow the electron track angle 
to be kept in the state vector. A detailed discussion of such methods is beyond the 
scope of this work. 



Chapter 9 



Instrument Response 



Once the various parts of the beam test instrument had been buih, and the software 
to analyze the data had been written, the instrument was placed in the beam line 
in End Station A. The data taken at SLAG during October 1997 yielded important 
insights into issues of backsplash self- veto in the ACD as well as the establishment of 
the feasibility of pointing resolution using the calorimeter only [|169|| . In this chapter, 
we will focus on the silicon tracker: the specific adaptations necessary to analyze 
tracker data and the instrument parameters measured. 



9.1 Alignment 

To avoid systematic errors in reconstructed particle direction, corrections were made 
to account for possible misalignment of the planes. Machining errors on the scale of 
the strip pitch (236 /im) would make it appear that a track had scattered more (or 
less) than it actually had. Smaller errors would influence track reconstruction in a 
statistically similar way. In part to ameliorate this effect, data was taken with the Cu 
radiating foil removed, so that beam electrons were directly incident on the detector. 
For 25 GeV beam electrons, multiple scattering is negligible. These tracks could thus 
be used to "align" the planes by finding the relative offset of each plane. The offsets 
were used to correct the positions of the hits in software. 

The straight electron tracks were fit with a line in each projection. In each plane. 
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the median value of the fit residual was taken as the plane offset. All strip positions in 
the plane were corrected, and the process iterated until the median residuals converged 
to zero. Because the strip measurements were quantized, the linear fits were subject 
to severe aliasing on the scale of the strip pitch. To alleviate the effects of this ahasing, 
a small amount of random noise was added to each strip position independently for 
each fit. 

The tracker was found to be nearly ahgned as constructed. The largest offset was 
~250 /im. The alignment procedure measured the strip positions to an accuracy of 
~50 /im. 

9.2 Cuts 

The SLAC main electron beam runs nominally at 120 Hz. The state of the beam 
test instrument, including all strips hit, all energy deposited in the calorimeter, and 
all ACD tiles hit were read out for each beam spill. With an average of one electron 
per pulse, approximately 30% of spills had no electrons in them at all. Spills with 
one or more electrons shed bremsstrahlung photons with a probability dependent on 
the thickness of the Cu radiator foil (3.5% Xq, 5% Xq, or 10% Xq). Therefore only a 
fraction of the 2.1 x 10^ triggers recorded on tape were useful. A filtering program (not 
to be confused with Kalman filtering) was developed to extract the useful triggers. 
The criteria for accepting a trigger as a useful event were the detection of hits in 
three successive tracker planes, or of more than 6 McV for low gain or 160 McV for 
high gain in the calorimeter. These criteria were adopted so as to be sure to accept 
any particle that passed through the tracker (whether or not it hit the calorimeter) 
as well as any event that did not interact with the tracker, such as a 7-ray that did 
not convert until it entered the calorimeter. The three-in-a-row requirement for the 
tracker was designed to ensure that random noise hits in the tracker would not pass 
the acceptance criteria. When the Cu foils were in place to produce 7-rays from the 
main electron beam, approximately 20% of the triggers were accepted. When the foil 
was removed, and the electron beam was directly incident on the instrument, nearly 
all the triggers were retained. 
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These "useful events" were then analyzed by the reconstruction software. Many of 
them had no interactions in the calorimeter and were thus thrown away. In addition, 
any event with one or fewer hits in either projection was thrown away. The remaining 
events were reconstructed, but not all of these were satisfactory to be included in the 
analysis. The first requirement was that the hodoscopic calorimeter reported only 
one electron in the spill. The presence of multiple beam electrons in the spill greatly 
increases the chance of multiple bremsstrahlung 7-rays entering the tracker at the 
same time. Second, each ant i- coincidence tile was required to have less than 1/4 MIP 
(minimum ionizing particle; see ||^ for details) of energy deposited. While the ACD 
is designed to detect charged particles, it represents 1% Xq of material, which can 
cause 7-rays to pair convert. If a 7-ray pair converts in an ACD tile, each electron 
will deposit 1 MIP, times the fraction of the tile through which the electron passes. 
For example, if the 7-ray converted halfway through the tile, then the two electrons 
would deposit 2x(l/2xlMIP) = 1 MIP. The 1/4 MIP threshold will therefore reject 
all events where a 7-ray converts in the top 7/8 of the thickness of the tile. Of course, 
there are two layers of scintillator on the top of the instrument, so any conversions 
in the first layer will deposit 2 MIP in the second layer. Thus this cut eliminates 
15/16ths of the events which convert in the ACD, as well as any charged particle 
events. 

In addition, cuts were made based on the characteristics of the tracks themselves. 
All tracks were required to have at least three real hits, exclusive of "virtual" hits 
placed on noisy strips or outside of the tracker. The total track was divided by 
the number of hits in the track, and this reduced w^is required to be less than 
5. The final cut demanded that all tracks start at least 4.7 mm (20 strips) from the 
edge of the active area of the tracker. Electrons from 7-rays that convert that close 
to the edge are likely to exit the tracker preferentially, and will introduce a bias to 
the distribution of reconstructed 7-ray directions. The efficiency of each cut is given 



in Table d.l 



Monte Carlo cuts. In an effort to make the beam test data as directly comparable 
with Monte Carlo simulations as possible, the Monte Carlo data was subjected to very 
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Cut Triggers Kept 
42-55 % 
57-84 % 
77-87 % 
73-82 % 

89-96 % 

All Cuts 15-25 % 

Table 9.1: Cut efficiencies 

similar cuts. The Monte Carlo included an anti-coincidence system, and a similar cut 
was made to reject events which converted in the plastic scintillator. All of the cuts 
based on track parameters were made in the exact same way for both the Monte Carlo 
and the beam test data. Since most of the Monte Carlo simulations were done with 
incident 7-rays drawn from a bremsstrahlung energy spectrum, there was no need to 
make cuts to ensure only one 7-ray in the tracker. 

The particular cuts made were chosen to simplify analysis of the beam test data. 
They are not meant to represent the types of cuts that will be made on for GLAST . 
GLAST will be faced with a very different environment in space, replete with back- 
ground particles, albedo 7-rays, and signal 7-rays incident from all directions. It 
also has a very different geometry, which will require combining data from different 
towers and accounting for gaps and support structure. The relevant feature of the 
cuts made here are that they are very nearly identical for the beam test instrument 
and the Monte Carlo simulations. This will allow direct comparison of their results. 
However, the instrumental parameters measured for the beam test instrument will 
not be directly scalable to GLAST. 



Hodoscope 
ACD 

Three Hits 
Edge 



9.3 Expected Beam Test Point-Spread Widths 

. Once the various cuts have been applied to the data sets, distributions of the 
reconstructed angle can be compared. Before we compare these results, it may be 
useful to consider the theoretical widths we expect to find, based on the Kalman filter 



covariances. In § ^.9| we saw that we could calculate the expected point-spread width 
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Figure 9.1: Point-spread width dependence on total energy and energy splitting 
fraction. Darker shades indicate larger point-spread width. At lower energies, the 
width is nearly constant as a function of the splitting fraction. At high energies, the 
point-spread function is much larger for asymmetric splittings. The beam test was 
always in the low-energy limit for this purpose. 

of a given instrument configuration independent of the actual measured data, under 
the assumption that all distributions involved are Gaussian. In order to estimate the 
point-spread widths we will measure with the beam test instrument, we must properly 
average the point-spread width over all the different classes of events we expect to 
measure. Recall that the point-spread width a{E, f,p,d,x, N) is a function of the 
7-ray energy E, the fraction / of that energy in one electron, the silicon strip pitch 
p, the gap d between adjacent planes, the total amount x of material in the radiator 
and the detector in each plane, and the number of planes which measure the given 
event. For any instrument configuration, strip pitch p, gap d and radiator x are fixed. 

To facilitate comparison, we will look at events with approximately equal energies. 
Therefore we must average over a range of electron energy splittings and numbers 
of planes traversed. The point-spread width is an almost constant function of / 
below 5 GeV. Above 5 GeV, the point-spread width is significantly smaller for even 
energy split (Figure 9A), although such a splitting is comparatively less common^ 



(Figure |8.6|) . Therefore, we will assume that the point-spread width is approximately 

*The energy splitting is assumed here to be symmetric, as described by the Bethe-Heitler formula 
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independent of the energy splitting fraction /. The number of planes through 
which the e~e"'" pair travel depends on the total number of planes in the instrument 
(6) and in which plane the 7-ray converted. The probability that a 7-ray will convert 
in a given layer is given by equation (|8.3| ) with t/Xo equal to the thickness of the SSD 
plus the Pb radiator for the plane in question. The intensity of the 7-ray beam at 
that plane is given by equation ( |8.2| ), with t/Xo equal to the total number of radiation 
lengths of material above the plane in question. This represents the beam attenuation 
from the planes above. Therefore, the probability of a conversion in plane n is 

P{n) = (1 _ e^/s^) (9.1) 



where x is the total radiator in each plane. To calculate the average point-spread 
width, we wish to weight the point-spread width for each number of planes by the 
probability that the conversion happened in that plane. Recalling that we accept 
only events with at least three hits, we find the probability that a 7-ray converted in 
one of the top four planes: 

4 

P(n < 4) = ^ P(n) (9.2) 

n=l 

Therefore, the probability that any given pair-production event converted in plane n 
is 

P(n|l < n < 4) = P(n)/P(n < 4) (9.3) 

Plugging in, we arrive at the weights we will apply to the variances measured on each 
plane. 

-|(n-l)a; 

P{n) = J ^ (9.4) 

ELiexp (-|(n- l)xj 

It remains only to examine the point-spread width o" as a function of total 7- 
ray energy E for each instrument configuration. The nature of the Kalman equations 
(§ ^.3| ) makes the calculation of the point-spread width tedious. However, it is a simple 
matter to compute electronically . 



p8[ . Below about 3.3 MeV, asymmetries arise because the nucleus attracts e while repelling e+. 
The asymmetric differential cross sections have been worked out by 0verb0, Mork, and Olsen |154[. 
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Figure 9.2: x-projected point-spread widths for Pancake configuration with no Pb 
radiators (left) and 4% Pb radiators (right). Circles indicate the 68% containment 
width, and squares indicate the 95.5% containment width. Error bars are 2a statisti- 
cal errors, and shaded regions represent the 2a confidence regions of the Monte Carlo 
estimates. The line is an estimate of the point-spread width from the Kalman filter. 

9.4 Conclusions 

While our theoretical calculations make some simplifying assumptions which may 
not be entirely accurate, we hope that the Monte Carlo simulations will more com- 
pletely represent the actual data. For example, the gismo code has a sophisticated 
multiple scattering model which is much more accurate than the simplistic Gaussian 
assumption made above. Furthermore, the Monte Carlo simulations included all the 
particular geometric elements of the beam test instrument. 



9.4.1 Comparison of Beam Test and Monte Carlo Results 

The measured point-spread widths for both the Monte Carlo simulations and the 



beam test data are shown for each instrument configuration in Figures and p73 



There is good agreement between the two out to the 95% containment radius. Two 
example distributions are shown in Figure U^. 



Three features of the Kalman filter estimate are relevant. First of all, in general 
the Kalman estimate is lower than the measured widths. Second, the slope of the 
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Kalman estimate with energy is steeper. Finally, the Kalman estimate reaches an 
asymptotic limit faster, and the limit is significantly lower. 

The Kalman estimates are lower than the measured widths in general because 
the Kalman estimates are based on Gaussian errors. Furthermore, the point-spread 
widths estimated by the Kalman filter are the standard deviations; that is, they 
correspond to the 68% containment radius of the point-spread function assuming it 
is Gaussian. As we have noted, the multiple scattering distributions have significant 
tails, and the measurement error distributions are more square than Gaussian. These 
effects tend to degrade the fit quality, leading to larger point-spread widths. 

The slope of the Kalman estimate is very nearly 1/E at low energies; this is 
expected since it is based on a Gaussian multiple scattering model with the width 
inversely proportional to the energy. On the other hand, the actual beam test in- 
strument is very narrow. At low energies, this leads to a strong self-coUimation 
effect. 7- Rays whose electrons initially make large scatters would be reconstructed in 
a wide instrument with large apparent incident angles. These events cannot be re- 
constructed with the limited data from the narrow instrument, and are thrown away. 
As we expect, stretch configuration displays more self-coUimation than pancake, and 
high-radiator configurations (with larger scattering) also display more self-collimation. 

At high energies, the width of the point-spread function is dominated by the 
measurement error. The assumptions made for the Kalman filter are not particularly 
good in this regime. The Kalman filter assumes that the position measurement is 
continuous, with Gaussian errors. In fact, the measurement is a discrete one, with 
roughly square errors. The effects of the square error distribution have been discussed 
in §8.5| . The effects of the discrete measurement are more difficult to assess. Since the 
tracker planes were very nearly aligned, and the 7-ray beam was very nearly aligned 
with the strip grid, the measured distributions are subject to aliasing. If the planes 
are perfectly aligned, then the minimum non-zero half-angle which can be measured 
by the instrument in pancake mode is 1/2 x 236 /im/150.0 mm f» 0?05. That assumes 
that the 7-ray converts in the middle of a strip, and that the electron continues down 
the instrument, hitting the same strip in each plane until the last one, by which time 
it has drifted half the strip pitch and activates the next strip. In fact, we see in 
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Figure 9.3: x-projected point-spread widths for Stretch configuration with no Pb 
radiators (left) and 4% Pb radiators (right). Circles indicate the 68% containment 
width, and squares indicate the 95.5% containment width. Error bars are 2a statisti- 
cal errors, and shaded regions represent the 2a confidence regions of the Monte Carlo 
estimates. The line is an estimate of the point-spread width from the Kalman filter. 

Figure |9.2| that the Kalman estimate is well below 0?05 above 5 GeV. Since the point- 
spread function for the beam test instrument and for the Monte Carlo simulations is 
quantized at that level, it is not surprising to find that its width is somewhat larger. 

These aliasing effects will actually be helped by the slight offsets in the instru- 
ment planes acquired during the launch of GLAST, assuming that they can be well 
measured using cosmic rays. In addition, the additional planes will act to reduce the 
aliasing. Perhaps more importantly, GLAST will be illuminated from a large fraction 
of the sky. In scanning mode, the bearing to a source in instrument coordinates will be 
constantly changing, serving to smear the aliasing over all scales. These effects should 
push the performance of GLAST closer to the Kalman estimate. In any case, since 
the Monte Carlo correctly models the aliasing, simulations of GLAST from physically 
reasonable sources will make accurate predictions of the GLAST point-spread width. 

A final difference between the instrument as tested and simulated and the idealized 
Kalman assumptions concerns the geometry of the detectors. The Kalman estimate 
assumes that the multiple scattering occurs in the same place as the the measurement, 
when in fact most of the multiple scattering occurs in the Pb radiators, which are a few 
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Figure 9.4: Full distributions of the reconstructed projected incident 7-ray angle. 
Left side is pancake configuration, 4% Pb radiators, 100-200 MeV. Right side is 
stretch configuration, no Pb radiators, 5-10 GeV. Vertical scale is renormalized for 
comparison of distributions. 



millimeters away from the detectors. Simulations of the baseline GLAST instrument 
suggest that as a rule of thumb, the point-spread width is degraded by 3.5% per 
millimeter of separation between the radiator and detector |]T5[. The separation in 
the beamtest instrument was about 2 mm, which would suggest a degradation of 
about 7% from the theoretical Kalman estimate. 



9.4.2 Implications for GLAST 

The agreement between the Monte Carlo simulations and the beam test data is clearly 
encouraging to the GLAST collaboration. The Kalman filtering formalism appears to 
succeed admirably at reconstructing electron tracks. Kalman estimates of the GLAST 
point-spread width for various configurations are given in Table |9.2| . 

The equivalent Kalman estimates of the beamtest instrument point-spread width 
proved to be a reasonable estimate of the instrument performance. However, some 
care must be taken to interpret these results in light of their implications for GLAST . 
Primarily, it is important to remember that the point-spread widths measured by the 
beamtest may not be used as an estimate of the GLAST point-spread width; they 
are two separate instruments with different characteristics. Furthermore, the beam 
test instrument was measured in a controlled, low background beam environment. 
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180 


0.0% 


0.91 


0.11 


0.023 


180 


3.5% 


2.0 


0.22 


0.035 


180 


5.0% 


2.3 


0.25 


0.038 


240 


0.0% 


0.91 


0.12 


0.029 


240 


3.5% 


2.0 


0.22 


0.040 


240 


5.0% 


2.3 


0.26 


0.043 


400 


0.0% 


0.93 


0.13 


0.044 


400 


3.5% 


2.0 


.24 


0.051 


400 


5.0% 


2.3 


0.27 


0.054 



Table 9.2: la widths of the point-spread function as calculated from the Kalman 
formalism, for different silicon strip pitches and amounts lead radiator (given in radia- 
tion lengths), assuming 16 planes space 30.0 mm apart. The method used to generate 
these results is given in § |8.9| . 

while GLAST will be operating in space, bombarded by charged particles and albedo 
7-rays. 

However, this does not mean that these results are irrelevant to GLAST. The ver- 
ification of the Monte Carlo code implies that simulations of GLAST with glastsim 
should yield accurate estimates about the final instrument parameters without build- 
ing many expensive prototypes. Promises made by glastsim may be reasonably 
expected to be fulfilled by GLAST. 



Appendix A 



SSB Arrival Time Corrections 



There are four components to the corrections to Solar System Barycenter (SSB) Time 



from UTC time in equation ( |4.6| ) |P3| , |112|| . The first is due entirely to timekeeping 
convention, and translates UTC to a monotonic sequentially indexed time known as 
International Atomic Time: 



^convention = ^ + 32^84 (A.l) 

where k is the integral number of leap seconds since 1972. 

The second correction is made for the position of the observatory. Since the pulsar 
is much further away from the barycenter than the observatory is, the difference in 
path lengths is simple. Dividing by the speed of light, we arrive at the correction to 
the arrival time: 

^location = (^-2) 

where fi is a unit vector pointing at the pulsar, and Vho is the vector between the SSB 
and the observatory. In practice, this is the vector sum of the vector from the SSB 
to the center of the Sun, the vector from the Sun to the Earth, and the vector from 
the Earth to CGRO. The position vectors for the Earth and Sun are taken from the 
standard ephemeris published by the Jet Propulsion Laboratory [|185|| . 

The remaining two corrections are relativistic in nature. The first is the Einstein 
delay ^Einstein- "^^^^ is the manifestation of the old adage that "heavy clocks run 
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slowly." It depends only on the depth of the potential well in which the observatory 
sits, not the path that a 7-ray takes to arrive there. Therefore, the Einstein delay 
is included in the JPL ephemeris. There is an additional term that results from the 
ellipticity of the Earth's orbit of the form \^-rbo/c^. The portion of this term due to 
the Earth's position is already included in International Atomic Time; we need only 
to add the part due to the satellite position: 

^Einstein = ^Einstein (JPL) + (^-3) 

The last correction is the Shapiro delay ^Shapiro' '^^^^ results from the delay 
induced by the gravitational potential in the region of the Sun ||18CI|| and goes as 



^Shapiro ^ 1^(1 + cos e) (A.4) 

where 6 is the angle between the vector from CGRO to the Sun and the vector from 
CGRO to the pulsar. 

Combining all these corrections, the arrival time in the SSB frame as a function 
of the measured UTC time is given by: 

U = tuTC + + 32!184 + (l/c)n-ro + 
^Einstein + (Vc')v©-reo + 

{2GMq/c^) ln(l + cos e) (A.5) 



The code to make these adjustments to EGRET arrival times was written by Joe 
Fierro |^ , based on calculations and data found in []T^ |7^, |18(J| , |185| , |187|| and ||188 . 



Appendix B 

Summary of Kalman Filtering 
Equations 



Define the first state vector Xq and covariance Co 
For each plane k in the instrument: 
Project from the last plane: 



Ck,proj — Ffc_iCfc_iF 



T 

k-1 



Q 



Filter the estimates: 



fc-i 



-1 



Cfe — {Ck,proj) ^ + H^GfeHfe 

Xfc — Cfe {Ck^proj) ^^k,proj + H^Gfelllfc 



Starting at the second-to-last plane and working back up: 
Smooth the estimates: 



^kF'ki^k+l,proj 



^k,smooth ^k ~l~ -^ki^k+l, smooth ^k+l,proj^ 
^k,smooth — Cfc -|- A-ki^k+l, smooth ~ ^k+l,proj)-^k 
^k,smooth — ^^k,smooth Hk^kjSmooth 



^^k, smooth 
,2 _ „T 



Vi. — Hi-C 



k^k, smooth 



H 



^k,smooth^k,smooth^k,smooth 
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Appendix C 

Track Finding Algorithm 



While the Kalman filter offers a way to quickly and automatically find the best fit 
electron track to a given set of detector hits, identifying which hits belong to the track 
is not a science, but an art. We have experimented with a number of algorithms, and 
have empirically fine-tuned this one. This is the algorithm used by t jrecon for finding 
tracks. 

• Load the data from one event into the data structures. 

• For each projection: 

1. Create a new track: 

— Initialize all the matrices and arrays 

- Assign first hit to be leftmost hit in first layer with hits 

- Assign second hit to be leftmost hit in second layer with hits. 

— If there are < 1 hits in this projection, go to next projection 

- Assign initial slope to first hit 

- Assign half the calorimeter energy to the track 

— Assign the multiple scattering matrix for that energy as the Covariance 
matrix. 

2. Create a second track, using leftmost hit in first layer with hits, rightmost 
hit in second layer. If there are no hits in the second layer, continue down 
until you find a layer with hits. 

3. Find the hit for each layer of the first track: 

— Project state vector and C matrix to next layer 
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— Assign the hit for that layer: 

* If the track projects out of the tracker, add a virtual hit outside 
the tracker, and assign that hit to the track. 

* If there is a dead strip within ±10 strips, add a virtual hit on the 
dead strip. Don't assign it yet. 

* Find the hit closest to the projected track 

* If there is only one hit on that layer, assign it to the track. 

* If the closest hit already belongs to another track, take the next 
closest hit. Assign it to the track. 

* If the closest hit is unclaimed, assign it to the track. 

— Revise ("filter") the projection of the state vector and C matrix in 
light of this measurement. 

4. Smooth the first track: Starting at the bottom, apply the Kalman smooth- 
ing equations to the state vectors and C matrices 

5. Find the hit for each layer of the second track. 

6. Smooth the second track. 

7. Untangle the tracks by swapping hits from one track to the other so they 
do not cross. 

8. Check free hits for first track: 

— For each level, starting at the top, for each unclaimed hit, swap the 
track's current hit with the unclaimed hit. 

— Filter the track, (giving a new set of predicted locations at each level 
below the current one) 

— For each successive level, check to see if the new predictions suggest 
different hits in the track. If so, and if they are real hits (not virtual 
ones), assign them to the track. Filter. Continue to the bottom of the 
tracker. 

— Smooth the new track. 

— Compare the of this new track with the original. If it's lower, keep 
the new track. Otherwise revert to the old track. 

— Start the whole game over from scratch on the next level down. 

9. If the track changed at all in the above step, check free hits again, this 
time starting at the bottom. 

10. Check free hits for the second track, starting at the top. 

11. If the second track changed, check free hits again, this time starting at the 
bottom. 
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12. Starting at the bottom, check to see if swapping hits helps: 

— For each level, swap the hits between the two tracks. Filter and smooth 
both tracks. 

— If the new is smaller, keep the new tracks. Otherwise revert. 

13. Make sure tracks don't cross between first and second planes. If the two 
tracks do not share their first hit, it is hkely that the tracks don't really 
cross. Filter and smooth both tracks. 

14. Check to see if one of the tracks may have left the tracker: (If so, the two 
tracks may currently share hits on all layers below the exit point) 

— Find the lowest level on which the tracks share hits. If they don't 
share any, then quit. 

— If the first shared hit is at least the third hit in the track, create a 
virtual hit outside the tracker in the direction of the current track. 
Assign it to the track. 

— Remove all successive hits from the track. 

— Filter and smooth. 

— Check free hits again, to see if, in this new configuration, the track 
would prefer different hits upstream. 

— If the if smaller, keep the new tracks. Otherwise revert. 

15. Repeat all for the other projection. 

• If we don't have four good tracks, then throw the event away. 

• If at least one of the four tracks leaves the tracker: 

— Find the track that leaves the tracker earhest. 

— Find the track in the other projection with the larger x^- We will claim 
that this track is the other projection of the track that leaves. 

— Remove all hits from that track below the exit point of it's opposite- 
projected track. 

• For each projection: 

— Vary the energy split between the two tracks. 

— Filter and smooth with the new energy split. 

— Find the split fraction that minimizes x^. 

— In practice, this always assigns all the energy to one track. Assign 75% of 
the energy to that track, 25% to the other. 

• Write tuple containing relevant information. 
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